The Globally search for a Regular Expression and Print matching lines
(GREP) strategy: an innovative reanalysis strategy combining
bibliographic monitoring with fast GREP directly applied to a massive
genomic database to rapidly improve diagnosis
Abstract
Purpose: Exome sequencing (ES) has a diagnostic yield ranging from
25% to 70% in rare diseases and regularly implicates genes in novel
disorders. Prospective data reanalysis has demonstrated strong efficacy
in improving diagnosis, but poses organizational difficulties for
clinical laboratories. We applied a reanalysis strategy that combines
intensive prospective bibliographic monitoring with direct use of the
Globally search for a Regular Expression and Print matching lines (GREP)
command-line tool on a massive ES database. Methods: For 18 months,
we submitted the same five keywords of interest (intellectual
disability, (neuro)developmental delay, and (neuro)developmental
disorder) to PubMed daily to identify recently published, novel
disease-gene associations, or new phenotypes in genes already implicated
in human pathology. We used the Linux GREP command-line tool and an
in-house script to collect all variants in these genes from our database
of 5459 exomes. Results: We grepped 128 genes and collected 56
candidate variants in 53 individuals. We confirmed causal diagnosis for
19/128 genes (15%) in 21 individuals, and identified variants of
unknown significance for 19/128 genes (15%) in 23 individuals.
Altogether, we confirmed pathogenicity in 21/2875 undiagnosed affected
probands (0.7%). Conclusion: The GREP command-line tool is
efficient and less tedious than complete periodic reanalysis. It is
an interesting reanalysis strategy for improving diagnosis.
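
The gene-lookup step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' in-house script, which the abstract does not describe. It assumes a hypothetical tab-separated layout (sample, gene, variant) and placeholder gene symbols, and it mimics what `grep -w -F -f genes.txt variants.tsv` would do: keep every line that contains one of the listed symbols as a whole word.

```python
import re


def grep_genes(lines, genes):
    """Return the lines that mention any listed gene symbol as a whole word."""
    if not genes:
        return []
    # Escape each symbol so it is matched literally (like grep -F), and try
    # longer symbols first so that a short symbol cannot shadow a longer one.
    alternation = "|".join(
        re.escape(g) for g in sorted(genes, key=len, reverse=True)
    )
    # Require a non-word character (or line edge) on both sides, like grep -w.
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])(?:" + alternation + r")(?![A-Za-z0-9_])"
    )
    return [line for line in lines if pattern.search(line)]


# Hypothetical variant records: sample ID, gene symbol, variant description.
variants = [
    "S001\tGENE1\tc.100A>G",
    "S002\tGENE10\tc.200C>T",
    "S003\tGENE2\tc.300del",
]

# Placeholder symbols standing in for newly published disease genes.
hits = grep_genes(variants, {"GENE1", "GENE2"})
for line in hits:
    print(line)
```

Whole-word matching matters here because gene symbols are often prefixes of one another: in this sketch a search for GENE1 does not return the GENE10 record.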