Phasing of de novo mutations using a scaled-up multiple amplicon
long-read sequencing approach
Abstract
De novo mutations (DNMs) play an important role in severe genetic
disorders that reduce fitness. To better understand the role of DNMs in
disease, it is important to determine the parent-of-origin and timing of
the mutational events that give rise to the mutations, especially in
sex-specific developmental disorders such as male infertility. However,
currently available short-read sequencing approaches are poorly suited
to phasing, because phasing requires long continuous DNA strands that
span both the DNM and one or more informative SNPs. To overcome this
limitation, we optimised and implemented a multiplexed long-read
sequencing approach using the Oxford Nanopore Technologies MinION
platform. We specifically focused on improving target amplification,
integrating long-read sequencing data with high-quality short-read
sequence data, and developing an anchored phasing computational method.
This approach was able to handle the inherent phasing challenges that
arise from long-range target amplification and the normal accumulation
of sequencing error associated with long-read sequencing. In total, 77
out of 109 DNMs (71%) were successfully phased and their parent-of-origin
identified. The majority of phased DNMs were prezygotic (90%), a
classification supported by the average mutant allele frequency of 49.6%
(standard error margin of 0.84%), close to the 50% expected for a
heterozygous germline variant. This study demonstrates
the benefits of using an integrated short-read and long-read sequencing
approach for large-scale DNM phasing.
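
As an illustration of the read-based phasing idea described above, the following minimal Python sketch assigns a parent-of-origin to a DNM by counting long reads that carry the mutant allele together with one informative SNP allele. It is not the anchored phasing method developed in this study: the function name `phase_dnm`, the input format (one `(dnm_base, snp_base)` pair per read spanning both sites), and the thresholds `min_reads` and `min_fraction` are all hypothetical choices made for the example. It assumes the child is heterozygous at the SNP and that the paternal and maternal alleles are already known from short-read data.

```python
from collections import Counter


def phase_dnm(reads, dnm_alt, snp_paternal, snp_maternal,
              min_reads=10, min_fraction=0.8):
    """Assign a parent-of-origin to a DNM from reads spanning the DNM and one SNP.

    reads: iterable of (dnm_base, snp_base) tuples, one per long read.
    All thresholds are illustrative assumptions, not values from the study.
    """
    counts = Counter()
    for dnm_base, snp_base in reads:
        if dnm_base != dnm_alt:
            continue  # only reads carrying the mutant allele are informative
        if snp_base == snp_paternal:
            counts["paternal"] += 1
        elif snp_base == snp_maternal:
            counts["maternal"] += 1
        # any other base at the SNP is treated as sequencing error and skipped

    total = counts["paternal"] + counts["maternal"]
    if total == 0 or total < min_reads:
        return "unphased"  # too few mutant reads to make a call

    parent, n = counts.most_common(1)[0]
    # require a clear majority to tolerate long-read sequencing errors
    return parent if n / total >= min_fraction else "unphased"


# Hypothetical example: mutant allele T, paternal SNP allele A, maternal G.
example_reads = [("T", "A")] * 18 + [("T", "G")] * 2 + [("C", "G")] * 20
print(phase_dnm(example_reads, "T", "A", "G"))
```

Requiring a majority fraction rather than unanimity is one simple way to tolerate the per-base errors typical of long reads; a real pipeline would usually combine evidence from several informative SNPs.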