Bioinformatics and interpretation of variants identified using ES.
In the PAGE study, sequence data were assessed for candidate pathogenic variants in a modified list of genes (the DDG2P list; 1628 genes)27 that were likely to be associated with developmental disorders37. Bioinformatic filtering selected variants for clinical review if they were rare, protein-altering, and had an inheritance pattern matching that of the gene being assessed. These ‘candidate pathogenic variants’ were then passed to the clinical review panel (CRP). The CRP (a multidisciplinary group of clinical geneticists, fetal medicine subspecialists, two clinical scientists and a genetic bioinformatician) reviewed anonymised variant annotation data and clinical findings using the Sapientia software (version 1.75; Congenica, Cambridge, UK). The CRP reached a consensus view on the variant classification (i.e. pathogenic, likely pathogenic, variant of unknown significance, likely benign or benign) and on the likelihood that the variant was causative of the fetal phenotype. Using this methodology, 0.4 variants were reviewed per proband (fetus with structural anomaly)27.
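The filtering step described above can be illustrated with a minimal Python sketch. It is not the PAGE pipeline: the frequency threshold, the consequence terms, the field names and the placeholder gene symbol are all assumptions made for illustration, and the source does not state the actual cut-offs used.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration; the source gives no cut-offs.
MAX_ALLELE_FREQ = 0.001
PROTEIN_ALTERING = {"missense", "nonsense", "frameshift", "splice_site", "inframe_indel"}


@dataclass
class Variant:
    gene: str            # gene symbol
    consequence: str     # predicted effect on the protein
    allele_freq: float   # population allele frequency
    inheritance: str     # pattern observed in the trio, e.g. "de_novo", "biallelic"


def is_candidate(variant, panel):
    """Return True if the variant would be passed on for clinical review.

    `panel` maps each gene symbol on the gene list to the set of observed
    inheritance patterns that are compatible with that gene.
    """
    allowed = panel.get(variant.gene)
    if allowed is None:
        return False  # gene is not on the list
    return (
        variant.allele_freq <= MAX_ALLELE_FREQ          # rare
        and variant.consequence in PROTEIN_ALTERING     # protein-altering
        and variant.inheritance in allowed              # inheritance matches gene
    )


# "GENE_A" is a placeholder, not a real DDG2P entry.
panel = {"GENE_A": {"de_novo"}}
print(is_candidate(Variant("GENE_A", "missense", 0.0, "de_novo"), panel))
```

Restricting review to genes on the list is what keeps the number of variants per proband low, at the cost of missing variants in genes not yet linked to disease.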
The Petrovski paper used similar but not identical methodology to identify pathogenic variants28. Again, the authors used trio analysis and a previously published framework to allow rapid and efficient identification of de novo and inherited variants38. This focused upon two ‘tiers’ of qualifying genotypes. Tier 1 rested on the assumption that a relevant genotype would be highly penetrant and absent from the parents (and controls). Tier 2 was a literature-motivated screen, which permitted genotypes to be observed at low frequencies amongst controls (internal and external) but required the variant to have been previously classified as pathogenic in ClinVar or the Human Gene Mutation Database28,39. Again, potential causative variants were reviewed by a multidisciplinary conference of specialists to agree genotype/phenotype causation. This method analysed all genes and also incorporated ‘bioinformatic signatures’ to assess variants in genes that were not yet linked to disease. This resulted in an approximately ten-fold increase in the variant interpretation burden compared to PAGE (4.8 variants per case versus 0.42 per case requiring manual interpretation in PAGE), with a limited difference in the overall final pathogenic variant yield28. This demonstrates the need to balance a higher diagnostic yield against a higher interpretational burden in a prenatal ES strategy, as well as the need to consider the bioinformatics pipeline adopted. It is also important to recognise that the association of a variant with a phenotype may change over time, and that the ‘variant’ list may be updated accordingly. Consequently, if fetal trio ES data were reanalysed periodically during childhood (every 1-2 years), additional pathogenic variants might be identified. This has already been recognised in the use of ES/WGS in paediatric datasets40.
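As a rough illustration of the two-tier logic described above, the following Python sketch assigns a genotype to a tier. It is a simplification under stated assumptions, not the published framework: the function name, its parameters and the tier 2 frequency threshold are hypothetical, and the real framework applies further criteria that are not reproduced here.

```python
def assign_tier(in_parents, control_freq, known_pathogenic, max_tier2_freq=0.001):
    """Assign a qualifying genotype to tier 1, tier 2, or neither (None).

    in_parents:       True if the same genotype is seen in either parent
    control_freq:     frequency of the genotype amongst controls
    known_pathogenic: True if previously classified as pathogenic in a
                      curated database (e.g. ClinVar or HGMD)
    max_tier2_freq:   hypothetical threshold; the source gives no value
    """
    # Tier 1: genotype absent from the parents and from controls.
    if not in_parents and control_freq == 0:
        return 1
    # Tier 2: tolerated at low control frequency, but only if already
    # reported as pathogenic in the literature-derived databases.
    if known_pathogenic and control_freq <= max_tier2_freq:
        return 2
    return None


print(assign_tier(in_parents=False, control_freq=0.0, known_pathogenic=False))
print(assign_tier(in_parents=True, control_freq=0.0005, known_pathogenic=True))
```

Because tier 1 is not restricted to a gene list, every gene can contribute qualifying genotypes, which is consistent with the higher interpretation burden reported for this approach.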