Other identification tools
IntSplice (http://www.med.nagoya-u.ac.jp/neurogenetics/IntSplice)
is a web server for identifying SNVs that affect intronic cis-elements.
The precise spatiotemporal regulation of splicing is mediated by
splicing cis-elements on pre-mRNA. IntSplice uses a support vector
machine (SVM) model based on the effect size of each intronic nucleotide
on annotated alternative splicing. It predicts the splicing consequences
of SNVs at intron positions -50 to -3 in the human genome. When applied
to distinguish pathogenic SNVs in the Human Gene Mutation Database from
normal SNVs in the dbSNP database, the IntSplice model achieved good
results [79].
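The position window above can be made concrete with a small coordinate
check. The following is a minimal sketch, not IntSplice's own code: it
assumes that positions -50 to -3 are counted upstream from the 3' splice
site (the last intronic base being -1), and the `Intron` container and
function names are hypothetical.

```python
from dataclasses import dataclass

# Window stated in the text: intron positions -50 to -3.
# Assumption: positions are counted upstream from the 3' splice site.
WINDOW_START = -50
WINDOW_END = -3


@dataclass(frozen=True)
class Intron:
    """Hypothetical container: 1-based genomic coordinates of one intron."""
    start: int   # lowest genomic coordinate of the intron
    end: int     # highest genomic coordinate of the intron
    strand: str  # "+" or "-"


def relative_to_acceptor(intron: Intron, pos: int) -> int:
    """Return pos relative to the 3' splice site (last intronic base = -1)."""
    if not intron.start <= pos <= intron.end:
        raise ValueError("position is not inside the intron")
    if intron.strand == "+":
        return pos - intron.end - 1
    if intron.strand == "-":
        # On the minus strand the 3' end of the intron is its lowest coordinate.
        return intron.start - pos - 1
    raise ValueError("strand must be '+' or '-'")


def in_intsplice_window(intron: Intron, pos: int) -> bool:
    """True if an SNV at pos falls in the -50 to -3 window."""
    rel = relative_to_acceptor(intron, pos)
    return WINDOW_START <= rel <= WINDOW_END
```

An SNV outside this window (for example at -2 or -51) would, under these
assumptions, simply not be a candidate for this kind of prediction.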
The Natural Language Processing based non-synonymous Single Nucleotide
Polymorphism Predictor (NLP-SNPPred, http://www.nlp-snppred.cbrlab.org)
web server distinguishes pathogenic from neutral protein-coding
variants using state-of-the-art natural language processing (NLP), a
branch of artificial intelligence (AI). After feature extraction, a
multi-class classifier (CLF1) followed by a binary classifier (CLF2) is
created. NLP-SNPPred uses the NLP approach to read the biological
literature in order to identify pathogenic versus neutral variants. It
outperforms state-of-the-art functional prediction methods and can be
used to predict the functional effects of protein-coding mutations. More
features, such as homology, epigenomics, and evolutionary information,
are to be added to NLP-SNPPred [80].
The impact of genetic variation on macromolecular structure in diseases
Although GWAS have identified some genetic variants, the mechanisms
linking genetic variation to disease remain unclear. Genetic variations
can alter the structure of RNA or proteins, leading to abnormalities in
biological function or even disease. Some studies found that the
molecular mechanisms of some typical diseases are closely related to the
structural effects of genetic variation, such as hypertension [81,82],
retinoblastoma [83,84], and β-thalassemia [85-90].
Hereditary hyperferritinemia-cataract syndrome
Hereditary hyperferritinemia-cataract syndrome (HHCS) is a rare disease
characterized by high serum ferritin levels and congenital bilateral
cataracts. U22G and the double mutant U22G-G14C are SNP variants in the
5’-UTR of ferritin light chain (FTL) mRNA. The FTL 5’-UTR contains an
iron-responsive element (IRE), which plays a major regulatory role in
mRNA translation. These two variants can affect RNA structure and
subsequent gene function [91]. U22G disrupts the structure of the IRE,
leading to abnormal FTL gene regulation, whereas U22G-G14C restores the
mutated IRE to the wild-type structure [92].
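The logic of a disruptive mutation that is rescued by a second,
compensatory one can be illustrated with a toy base-pairing check. This
is only a sketch under stated assumptions: the sequence `TOY_IRE` is
hypothetical (it is not the real FTL 5’-UTR), and it is assumed for
illustration that positions 14 and 22 face each other in a stem. Real
structure prediction requires a folding algorithm rather than a single
pair test.

```python
# Pairs allowed in RNA helices: Watson-Crick pairs plus the G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

# Hypothetical 30-nt placeholder with G at position 14 and U at position 22.
TOY_IRE = "A" * 13 + "G" + "A" * 7 + "U" + "A" * 8


def mutate(seq: str, pos: int, ref: str, alt: str) -> str:
    """Apply a substitution at a 1-based position, checking the reference base."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[: pos - 1] + alt + seq[pos:]


def pair_intact(seq: str, i: int, j: int) -> bool:
    """True if the bases at 1-based positions i and j can pair."""
    return (seq[i - 1], seq[j - 1]) in PAIRS


wild_type = TOY_IRE
single_mutant = mutate(wild_type, 22, "U", "G")      # U22G
double_mutant = mutate(single_mutant, 14, "G", "C")  # U22G-G14C

for name, seq in [("wild type", wild_type),
                  ("U22G", single_mutant),
                  ("U22G-G14C", double_mutant)]:
    print(name, "pair 14-22 intact:", pair_intact(seq, 14, 22))
```

In this toy model the G-U pair of the wild type is lost in U22G (G
opposite G) and regained as a C-G pair in the double mutant, which
mirrors the rescue described in the text.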
Rs886037623 (T22G) changes the spatial structure of the folded mRNA
because the original U is replaced by G in the mRNA [93]. Under normal
circumstances, in a low-iron environment, iron regulatory proteins (IRPs)
bind to the IRE in correctly folded mRNA to form a complex that
represses protein synthesis, so the synthesis of ferritin is inhibited.
After mutation, the structurally altered mRNA can no longer bind IRPs,
translational regulation is lost, and a large amount of ferritin is
produced, resulting in hyperferritinemia. At the same time, the excess
ferritin precipitates in the lens, resulting in cataracts [94]. The
effects of genetic variation on HHCS are shown in Figure 5A.
Sicklemia
Sicklemia is the most serious of the hemoglobinopathies; its clinical
manifestations are chronic hemolytic anemia, susceptibility to
infection, and chronic ischemia leading to organ and tissue damage
[95]. Its pathogenesis is complex and significantly related to genetic
factors. There is evidence that at rs334, in the gene encoding the
hemoglobin β-chain, T is replaced by A; after transcription and
translation, glutamic acid is replaced by valine at the sixth position
of the β-chain amino acid sequence, forming abnormal hemoglobin [96].
When the oxygen partial pressure decreases, the abnormal hemoglobin
molecules interact with each other to form helical polymers, which
distort red blood cells into sickle cells and finally lead to anemia
[97].
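The step from a single-nucleotide change to the Glu-to-Val substitution
can be traced at the codon level. The sketch below assumes the commonly
cited codon change GAG to GTG on the coding strand and assumes that the
T > A notation refers to the opposite (complementary) strand; the helper
names are illustrative only.

```python
# Only the two codons needed here, taken from the standard genetic code.
CODON_TO_AA = {"GAG": "Glu", "GTG": "Val"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def substitute(codon: str, index: int, alt: str) -> str:
    """Replace the base at a 0-based index within a codon."""
    return codon[:index] + alt + codon[index + 1:]


wild_codon = "GAG"                              # assumed wild-type codon (Glu)
sickle_codon = substitute(wild_codon, 1, "T")   # A > T on the coding strand

# The same change viewed on the opposite strand.
wild_opposite = reverse_complement(wild_codon)
sickle_opposite = reverse_complement(sickle_codon)

print(wild_codon, "->", sickle_codon, ":",
      CODON_TO_AA[wild_codon], "->", CODON_TO_AA[sickle_codon])
print("opposite strand:", wild_opposite, "->", sickle_opposite)
```

Under these assumptions a T > A change on one strand and an A > T change
on the other describe the same variant, and the middle base of the codon
is the one that switches the encoded residue from Glu to Val.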
The effects of genetic variation on sicklemia are shown in Figure 5B.
Figure 5. Effects of genetic variation on HHCS and sicklemia. (A) HHCS:
because of the mutation rs886037623 (T > G), the corresponding U in the
transcribed mRNA is replaced by G, which changes the mRNA structure so
that IRP cannot bind to it, finally resulting in the overexpression of
ferritin. (B) Sicklemia: because of the mutation rs334 (T > A), and
through the subsequent transcription and translation, Glu (E) at
position 6 of the amino acid sequence of the protein becomes Val (V),
which changes the structure of the hemoglobin molecule and finally leads
to sicklemia.
Tumor and cancer
In cancer research, the influence of genetic variation on
macromolecular structure cannot be ignored. For example, retinoblastoma
(RB) is a malignant tumor arising from photoreceptor precursor cells; it
is common in children under 3 years old and shows familial genetic
susceptibility [98]. Some mutations have been shown to be closely
related to RB [84]. Cowell et al. [83] identified a novel mutation
(G→C) within a core motif for the specificity protein 1 (SP1)
transcription factor in a family with mild RB, and a band shift caused
by an unidentified protein was observed with the mutant oligomer. This
protein may affect the expression of the RB1 gene and eventually lead to
RB.
In addition, the influence of genetic variation on lncRNAs has been
extensively explored in some cancers. LINC00673 is a potential tumor
suppressor in pancreatic cancer. Rs11655237 is an SNP in an exon of
LINC00673 that gives LINC00673 a new binding target, thus weakening its
role and increasing the risk of pancreatic cancer [99]. The A713G and
T714C mutations in the lncRNA GAS8-AS1 accelerate the growth of cancer
cells and increase the risk of thyroid cancer [100]. Abnormal somatic
copy number and expression of focally amplified lncRNA on chromosome 1
(FAL1) can inhibit p21 and lead to ovarian cancer [11].