Substitution spectra of the SARS-CoV-2 genome reveal insights into the
evolution of variants across the pandemic
Abstract
Background: Changing morbidity and mortality from COVID-19 have
been associated with the emergence of new SARS-CoV-2 variants, in which
acquired mutations in the Spike glycoprotein enhanced host
receptor binding, cell entry and antibody escape. Understanding these
mutations can help predict their impact. We used genome sequence
data to investigate mutation rates and entropy of SARS-CoV-2 during
pandemic surges between 2020 and 2022. Methods: A total of 1,637
SARS-CoV-2 genomes from Pakistan were analyzed using the Augur
phylogenetic pipeline. Substitution rates and entropy of genomes were
calculated by year, and entropy in the Spike gene was compared across
2020, 2021 and 2022 (defined as periods A, B and C, respectively). Central
Findings: In period A, G clades were predominant and the SARS-CoV-2 genome
substitution rate was 6.06 × 10⁻⁴ per site per year.
In period B, Delta variants were dominant and the substitution rate
increased to 9.74 × 10⁻⁴. In period C, Omicron
variants dominated, with a substitution rate of 5.02 × 10⁻⁴.
Genome-wide entropy was highest during period B, particularly in the
Spike gene at mutations such as E484K and K417N.
During period C, genome-wide mutations increased whilst entropy was
reduced. Conclusions: The highest SARS-CoV-2 genome
substitution rates in 2021 were associated with the Delta wave, which
had the greatest morbidity and mortality. Substitution rates stabilized during the
Omicron wave in 2022, when COVID-19 case numbers were high but mortality was
lower. Monitoring SARS-CoV-2 evolution together
with phylogeographical analysis can help predict future outbreaks and
guide public health interventions.
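
The two quantities reported above, per-site entropy and per-site substitution rate, can be illustrated with a minimal sketch. It is not the Augur pipeline used in the study. It assumes entropy means Shannon entropy of the nucleotide frequencies in each alignment column (natural logarithm), and it assumes a genome length of 29,903 nt (the Wuhan-Hu-1 reference) to convert a per-site rate into substitutions per genome per year. The function names and the constant are hypothetical, chosen for illustration only.

```python
import math
from collections import Counter

# Assumption: Wuhan-Hu-1 reference length; the abstract does not state the length used.
GENOME_LENGTH = 29_903


def site_entropy(column, alphabet="ACGT"):
    """Shannon entropy (in nats) of one alignment column.

    Gaps and ambiguous bases (anything outside `alphabet`) are ignored.
    """
    counts = Counter(base for base in column.upper() if base in alphabet)
    total = sum(counts.values())
    if total == 0 or len(counts) == 1:
        return 0.0  # no usable data, or a fully conserved site
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-site entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    # zip(*sequences) walks the alignment column by column.
    return [site_entropy("".join(col)) for col in zip(*sequences)]


def substitutions_per_genome_per_year(rate_per_site, genome_length=GENOME_LENGTH):
    """Convert a per-site, per-year rate into substitutions per genome per year."""
    return rate_per_site * genome_length


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT", "ACGA"]  # made-up data, not study data
    print(alignment_entropy(toy_alignment))
    for period, rate in [("A", 6.06e-4), ("B", 9.74e-4), ("C", 5.02e-4)]:
        print(period, round(substitutions_per_genome_per_year(rate), 2))
```

Under the assumed genome length, the reported rates correspond to roughly 18, 29 and 15 substitutions per genome per year for periods A, B and C, respectively.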