Other diseases
Parkinson’s disease (PD) is a common neurodegenerative disease in the
elderly that is caused by both genetic and environmental factors. DJ-1
is a protein whose loss of function can lead to neurodegeneration. Some
studies have shown that mutations in DJ-1, which is encoded by the PD
susceptibility gene PARK7, are associated with PD [106].
L166P is the most typical DJ-1 missense mutation; it disrupts the
integrity of α-helix G, resulting in poor protein folding in PD
[107,108]. This suggests that genetic variation in DJ-1 can alter the
structure of the protein and thereby contribute to the occurrence and
development of PD.
Amyotrophic lateral sclerosis (ALS),
also known as motor neuron disease (MND), is characterized by
progressive muscle weakness and atrophy caused by injury to upper and
lower motor neurons [109]. Some studies have shown that mutations in
human superoxide dismutase (HSOD) are related to familial ALS [110].
The variant rs121912442 changes Ala to Val at the fourth position of
the amino acid sequence, which shifts the side chain, destabilizes the
dimer interface, reduces the enzyme activity of the mutant protein
compared with the wild type, and promotes the formation of
HSOD-containing aggregates that are toxic to motor neurons [111]. This
suggests that mutation-induced changes in macromolecular structure can
alter enzyme activity and lead to aggregation of the mutant protein,
including toxic aggregates [110,112].
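A substitution such as Ala to Val at position 4 is commonly abbreviated
in the form A4V. The following minimal sketch shows how such a
substitution could be applied to a protein sequence before passing it to
a structure predictor. The function name `apply_missense` and the toy
peptide are hypothetical and chosen for illustration only; the peptide
is not the real HSOD sequence.

```python
def apply_missense(sequence: str, mutation: str) -> str:
    """Apply a substitution written as <ref><1-based position><alt>, e.g. 'A4V'.

    Hypothetical helper for illustration; it handles single-residue
    substitutions in one-letter amino acid code only.
    """
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != ref:
        # Guard against numbering mismatches (e.g. initiator Met counted or not).
        raise ValueError(
            f"expected {ref} at position {pos}, found {sequence[pos - 1]}"
        )
    return sequence[:pos - 1] + alt + sequence[pos:]


# Toy peptide (an assumption, not a real protein): Ala sits at position 4.
toy_peptide = "MKLAGT"
mutant_peptide = apply_missense(toy_peptide, "A4V")
print(toy_peptide, "->", mutant_peptide)
```

Checking the reference residue before substituting is a deliberate
choice: residue numbering conventions differ between databases, and a
silent off-by-one would produce a plausible but wrong mutant sequence.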
Congenital myasthenic syndrome (CMS) is a heterogeneous neuromuscular
disease characterized by muscle weakness. Its pathogenesis is complex
and involves both genetic and environmental factors [113]. It is
generally accepted that genetic factors, especially mutations in
certain important genes, play a key role in CMS [114].
Akihide
et al. found that the homozygous mutation c.913-5T>A in
RAPSN could inhibit the binding of U2AF65 to splicing cis-elements and
disrupt the splicing process, leading to CMS [79]. Saito et al.
identified the missense mutation c.737C>T (p.A246V) in
RAPSN as an important factor in CMS by exome sequencing [113].
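Variant descriptions of this kind follow an HGVS-style notation: a
coding-DNA position with an optional intronic offset and a nucleotide
change, or a protein position with the reference and alternative
residues. The sketch below parses only these two simple substitution
forms. The function name `parse_variant` and the returned dictionary
layout are assumptions made for illustration, and the regular
expressions cover far less than the full HGVS nomenclature.

```python
import re

# Simplified patterns (assumptions): single-nucleotide coding substitutions
# with an optional intronic offset, and one-letter protein substitutions.
CODING_SUB = re.compile(r"^c\.(\d+)([+-]\d+)?([ACGT])>([ACGT])$")
PROTEIN_SUB = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def parse_variant(text: str) -> dict:
    """Parse a simple substitution such as 'c.913-5T>A' or 'p.A246V'."""
    compact = text.replace(" ", "")  # tolerate spacing such as 'T > A'
    match = CODING_SUB.match(compact)
    if match:
        position, offset, ref, alt = match.groups()
        return {
            "level": "coding",
            "position": int(position),
            # Offset 0 means the change lies in the exon itself.
            "intron_offset": int(offset) if offset else 0,
            "ref": ref,
            "alt": alt,
        }
    match = PROTEIN_SUB.match(compact)
    if match:
        ref, position, alt = match.groups()
        return {"level": "protein", "position": int(position),
                "ref": ref, "alt": alt}
    raise ValueError(f"unsupported variant description: {text!r}")


for description in ("c.913-5T > A", "c.737C>T", "p.A246V"):
    print(description, "->", parse_variant(description))
```

A negative intronic offset, as in the first example, places the change
upstream of the exon boundary, which is consistent with an effect on
splicing rather than on the encoded amino acid.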
Charcot-Marie-Tooth disease (CMT) is one of
the most common hereditary peripheral neuropathies (incidence of about
1/2500) and is characterized by progressive muscle weakness and
atrophy of the distal extremities with sensory disorders [115].
Studies have found that the mutation rs137852646 causes the amino acid
substitution G526R in human glycyl-tRNA synthetase (GlyRS). The dimer
formed by the G526R protein is tighter than that of the wild type and
blocks the AMP-binding active site of GlyRS, resulting in the loss of
aminoacylation activity [106]. Therefore, the change in the dimer
interface caused by genetic variation may be an important factor in CMT,
which helps explain the CMT mechanism [116].
Discussion
Genetic variation has complex effects on the structure of
macromolecules, and predicting these effects is of great significance.
Because structure determines the function of macromolecules in
biological processes, effective and accurate predictions would greatly
accelerate research into the disease mechanisms by which genetic
variation produces macromolecules with abnormal structures. Starting
from the initial MFE model based only on thermodynamic laws, structure
prediction has gradually incorporated dynamic programming algorithms,
partition functions, sequence alignment, and ML, which has continuously
broadened the dimensions along which structural changes can be
described and has led to the more accurate and efficient results
available today. Tools that combine these methods with genetic
variation data have been used to predict the structural effects of
mutations, and the results can explain some disease mechanisms.
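To illustrate how dynamic programming can expose the structural effect
of a single-nucleotide change, the sketch below uses Nussinov-style
base-pair maximization. This is a deliberately simplified stand-in for
thermodynamic MFE folding, not the method of any tool discussed in this
review: it counts pairs instead of computing free energies. The function
name, the minimum loop length of 3, and both toy sequences are
assumptions chosen for illustration.

```python
# Watson-Crick and G-U wobble pairs allowed in this simplified model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Return the maximum number of nested base pairs (Nussinov recursion)."""
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: j pairs with k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# Toy hairpin (assumption) and a variant with G -> A at the third position.
wild_type = "GGGAAAACCC"
variant = "GGAAAAACCC"
print("wild type pairs:", max_base_pairs(wild_type))
print("variant pairs:  ", max_base_pairs(variant))
```

In this toy case the substitution removes one of the three G-C pairs in
the stem, so the variant supports fewer pairs than the wild type.
Energy-based and partition-function methods refine the same idea by
weighting each structural element and by considering the whole ensemble
of structures rather than a single optimum.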
At present, many methods and tools are commonly used to predict the
impact of SNPs on macromolecules, each with advantages and
disadvantages. Traditional RNA structure prediction methods are
relatively mature, but their accuracy and speed are not as good as
those of newer methods that incorporate ML. The newer methods are
expected to explain more of the influence of genetic variation on
macromolecular structure. Striking progress has been made in protein
structure prediction. AlphaFold2, developed by DeepMind, has achieved
high accuracy in predicting protein structure [63]. Minkyung Baek et
al. used a three-track network architecture to achieve structure
prediction accuracy close to that of DeepMind in CASP14 [64]. However,
combining genetic variation information with these methods to predict
the impact of genetic variation on protein structure is still at an
early stage [78]. As the accuracy of calculations increases, the
predicted details of biological macromolecules are becoming clearer,
and researchers are trying to quantify the impact of genetic variation
on macromolecular structure [112,117]. At the same time, the objects of
prediction have expanded from RNAs with shorter sequences to longer and
more complex chromosome structures [118].
The prediction of the influence of genetic variation on macromolecular
structure will become increasingly accurate as computational
efficiency increases and prediction methods diversify. It is hoped that
applying ML methods to predict the structural effects of genetic
variation will bring new understanding of complex diseases and cancers.
As the accuracy and breadth of structure prediction increase, it may
become possible to establish a personalized set of sequence data at
birth, predict an individual's disease risk arising from changes in
certain macromolecular structures according to that individual's
genetic variation, and give corresponding suggestions. In addition,
differences in genetic variation among populations have potential
effects on macromolecular structures; for example, the efficacy of
drugs on specific targets varies from person to person. Changes in the
corresponding target can be predicted by studying genetic variation in
populations or even in individual patients. It may even be possible to
formulate a reasonable individualized medication plan by predicting the
structural effects of an individual's mutations, to achieve the goal of
precision medicine.