Other diseases
Parkinson’s disease (PD) is a common neurodegenerative disease of the elderly that is caused by both genetic and environmental factors. DJ-1, encoded by the PD susceptibility gene PARK7, is a protein whose loss of function can lead to neurodegeneration. Several studies have associated DJ-1 mutations with PD [106]. L166P is the most typical DJ-1 missense mutation; it disrupts the integrity of α-helix G, resulting in poor protein folding in PD [107,108]. These findings suggest that genetic variation in DJ-1 can alter the structure of the protein and thereby influence the occurrence and development of PD.
Amyotrophic lateral sclerosis (ALS), also known as motor neuron disease (MND), is characterized by progressive muscle weakness and atrophy caused by injury to upper and lower motor neurons [109]. Some studies have shown that mutations in human superoxide dismutase (HSOD) are related to the familial inheritance of ALS [110]. The variant rs121912442 replaces Ala with Val at the fourth position of the amino acid sequence, which shifts the protein side chain, destabilizes the dimer interface, reduces the enzyme activity of the mutant protein relative to the wild type, and promotes the formation of HSOD-containing aggregates that are toxic to motor neurons [111]. This suggests that mutation-induced changes in macromolecular structure can alter enzyme activity and lead to the aggregation of mutant proteins, some of which are toxic [110,112].
Congenital myasthenic syndrome (CMS) is a heterogeneous neuromuscular disease characterized by muscle weakness. Its pathogenesis is complex and involves both genetic and environmental factors [113]. It is generally accepted that genetic factors, especially mutations in certain important genes, play a key role in CMS [114]. Akihide et al. found that a homozygous mutation (c.913-5T > A) in RAPSN could inhibit the binding of U2AF65 to splicing cis-elements and disrupt the splicing process, leading to CMS [79]. Saito et al. used exome sequencing to identify a missense mutation, c.737C > T (p.A246V), in RAPSN as an important factor in CMS [113].
Charcot-Marie-Tooth disease (CMT) is one of the most common hereditary peripheral neuropathies (incidence of about 1/2500) and is characterized by progressive muscle weakness and atrophy of the distal extremities with sensory disorders [115]. Studies have found that the mutation rs137852646 causes the amino acid substitution G526R in human glycyl-tRNA synthetase (GlyRS). The dimer formed by the G526R protein is tighter than that of the wild type, which blocks binding at the AMP active site of GlyRS and results in the loss of aminoacylation activity [106]. Changes in the dimer interface caused by genetic variation may therefore be an important factor in CMT, which helps to clarify the CMT mechanism [116].
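The missense variants cited in this section (L166P, p.A246V, G526R) are written as reference residue, position, substituted residue. As a minimal sketch of how such a variant can be applied to a protein sequence before it is passed to a structure predictor, the following Python function checks the reference residue and returns the mutant sequence. The function name `apply_missense` and the toy peptide are assumptions made for illustration; the peptide is not a real protein sequence, and the function is not taken from any tool discussed in this review.

```python
import re

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def apply_missense(sequence: str, variant: str) -> str:
    """Return `sequence` with a missense variant such as 'G526R' applied.

    An optional 'p.' prefix is accepted. Positions are 1-based.
    Hypothetical helper written for illustration only.
    """
    match = re.fullmatch(r"(?:p\.)?([A-Z])(\d+)([A-Z])", variant)
    if match is None:
        raise ValueError(f"unrecognised variant notation: {variant!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if ref not in AMINO_ACIDS or alt not in AMINO_ACIDS:
        raise ValueError(f"non-standard residue in variant: {variant!r}")
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    # Guard against numbering mismatches between variant and sequence.
    if sequence[pos - 1] != ref:
        raise ValueError(
            f"expected {ref} at position {pos}, found {sequence[pos - 1]}"
        )
    return sequence[: pos - 1] + alt + sequence[pos:]


# Toy peptide (an assumption for this example, not a real protein).
toy_peptide = "MKTAYIAKQR"
mutant_peptide = apply_missense(toy_peptide, "p.A4V")
```

The reference-residue check matters in practice because numbering conventions differ between databases (for example, whether the initiator methionine is counted), so a mismatch is reported as an error instead of silently mutating the wrong position.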

Discussion

Genetic variation has complex effects on the structure of macromolecules, and predicting these effects is of great significance. Because structure determines the function of macromolecules in biological processes, effective and accurate predictions would greatly accelerate research into disease mechanisms in which genetic variation gives rise to macromolecules with abnormal structures. Starting from the initial MFE model, which was based only on thermodynamic laws, structure prediction has gradually incorporated dynamic programming algorithms, partition functions, sequence alignment, and ML. This has continuously broadened the dimensions along which structural changes can be described and has led to the more accurate and efficient results available today. Tools that combine these methods with genetic variation data have been used to predict the structural effects of mutations, and the results can explain some disease mechanisms.
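To illustrate how a dynamic programming algorithm can expose the structural effect of a single-nucleotide change, the sketch below uses a Nussinov-style recursion that maximizes the number of nested base pairs. This is a deliberately simplified stand-in for the thermodynamic MFE model: it counts pairs instead of summing free energies, and it is not the method of any specific tool cited here. The sequences, the `min_loop` parameter, and the function name `max_base_pairs` are assumptions chosen for illustration.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in `seq` (Nussinov-style DP).

    `min_loop` is the minimum number of unpaired bases that must lie
    between two paired positions (a hairpin loop constraint).
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# Toy hairpin and a single-nucleotide variant of it (both hypothetical).
wild_type = "GGGAAAUCCC"
variant = "GAGAAAUCCC"  # G -> A at position 2 of the stem

wt_pairs = max_base_pairs(wild_type)
var_pairs = max_base_pairs(variant)
print(f"wild type: {wt_pairs} pairs, variant: {var_pairs} pairs")
```

A drop in the pair count between the two sequences indicates that the substitution weakens the stem. Energy-based predictors follow the same recursive decomposition but score each loop with experimentally derived free energies, and partition-function methods sum over all structures instead of keeping only the optimum.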
At present, many methods and tools are commonly used to predict the impact of SNPs on macromolecules, each with advantages and disadvantages. Traditional RNA prediction methods are relatively mature, but their accuracy and speed do not match those of newer methods that incorporate ML. The newer methods are expected to explain more of the influence of genetic variation on macromolecular structure. Striking progress has been made in protein structure prediction. AlphaFold2, developed by DeepMind, has achieved high accuracy in predicting protein structure [63]. Baek et al. obtained their best performance with a three-track network architecture, reaching structure prediction accuracy close to that of DeepMind in CASP14 [64]. However, combining genetic variation information with these methods to predict the impact of genetic variation on protein structure is still at an early stage [78]. As the accuracy of calculations increases, the details of predicted biological macromolecules are becoming clearer, and researchers are trying to quantify the impact of genetic variation on macromolecular structure [112,117]. At the same time, the objects of prediction now range from RNA with shorter sequences to longer and more complex chromosome structures [118].
Predictions of the influence of genetic variation on macromolecular structure will become increasingly accurate as computational efficiency improves and prediction methods diversify. Applying ML methods to predict the structural effects of genetic variation may bring new understanding of complex diseases and cancers. As the accuracy and breadth of structure prediction increase, it may become possible to establish a personalized set of sequence data from the time of birth, to predict from an individual's genetic variation the disease risk caused by changes in some of that individual's macromolecular structures, and to give corresponding advice. In addition, differences in genetic variation among populations have potential effects on macromolecular structures; for example, the efficacy of drugs on specific targets varies from person to person. Changes in the corresponding target can be predicted by studying genetic variation in populations or even in individual patients. It may even be possible to formulate a reasonable individualized medication plan by predicting the structural effect of an individual's mutations, thereby achieving the goal of precision medicine.