A pangenome is the aggregate characterization of genomic diversity
present in a group of interest, such as a species or population (e.g.,
diversity among strains of tomato; Alonge et al., 2020). Pangenomes
offer a straightforward way to address the challenges associated
with SV discovery and genotyping from short-read sequence data. Although
originally developed to characterize variation in bacteria (Tettelin et
al., 2005), they are commonly used in studies of complex trait diversity
in humans (Pang et al., 2010) and agriculturally significant species
(e.g., cattle, goats,
soybean and maize; Bickhart et al., 2020; Golicz, Batley, & Edwards,
2016; H. Li, Feng, & Chu, 2020; Y. Liu et al., 2020; Low et al., 2020;
McHale et al., 2012; Yang et al., 2019). A pangenome has two
components: the ‘core’ genomic regions that are shared by all
individuals, and the ‘accessory’ genomic regions that are present in
only some individuals (Bayer, Golicz, Scheben, Batley, & Edwards, 2020;
Golicz, Batley, & Edwards, 2016; Hurgobin & Edwards, 2017; Figure 4).
In a pangenomic approach, genomes
of multiple individuals are assembled de novo using a combination of
platforms (e.g.,
long reads, Hi-C, optical mapping; Soto et al., 2020; Weissensteiner et
al., 2020), followed by pairwise comparisons of whole-genome alignments
for SNP and SV discovery
(e.g., Cortex, MUMmer,
Minimap2; Delcher et al., 1999; Iqbal, Caccamo, Turner, Flicek, &
McVean, 2012; H. Li, 2018). Once variant discovery is complete, genome
graphs may be constructed to efficiently represent the ‘core’ and
‘accessory’ regions of the pangenome
(Eizenga et al., 2020;
H. Li, 2018; Tettelin et al., 2005). Genome graphs are a powerful
framework for population-level genotyping and consistently outperform
genotyping based on alignment to a single linear reference (e.g., Ebler
et al., 2020 preprint; Eggertsson, 2017; Iqbal et al., 2012; D. Kim, Paggi,
Park, Bennett, & Salzberg, 2019; H. Li, 2018). As a result, pangenomic
approaches can capture complex variants, including SVs, across the
genome and offer an exciting opportunity to explore the genomic basis of
complex traits (e.g., Gao et al., 2019).
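
The distinction between ‘core’ and ‘accessory’ regions can be illustrated with a minimal Python sketch. It assumes, as a simplification, that each individual's assembly has already been reduced to a set of region identifiers, and it treats a region as ‘core’ when it is present in every individual and as ‘accessory’ otherwise. The function name `partition_pangenome`, the individual labels, and the region labels are all hypothetical and are not taken from any of the tools cited above.

```python
from typing import Dict, List, Set, Tuple


def partition_pangenome(
    presence: Dict[str, Set[str]],
) -> Tuple[List[str], List[str]]:
    """Split regions into core (present in every individual) and
    accessory (present in at least one, but not all, individuals).

    `presence` maps a hypothetical individual ID to the set of region IDs
    observed in that individual's assembly.
    """
    if not presence:
        return [], []
    region_sets = list(presence.values())
    all_regions = set().union(*region_sets)    # every region seen anywhere
    core = set.intersection(*region_sets)      # regions seen in all individuals
    accessory = all_regions - core             # everything else
    return sorted(core), sorted(accessory)


# Hypothetical toy data: three individuals, four regions.
presence = {
    "ind_A": {"region_1", "region_2", "region_3"},
    "ind_B": {"region_1", "region_2", "region_4"},
    "ind_C": {"region_1", "region_3"},
}

core, accessory = partition_pangenome(presence)
print("core:", core)
print("accessory:", accessory)
```

In this toy example only `region_1` is shared by all three individuals, so it is the sole core region; the remaining regions are accessory. In practice, presence or absence would be derived from whole-genome alignments rather than supplied directly, and some studies relax the ‘present in all individuals’ threshold.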