Pangenome construction from short-read sequences: benchmarking for population and conservation genomics

Jong Yoon Jeon; Natalie Allen; Andrew Black; Andrew DeWoody

doi:10.22541/au.173194407.74419464/v1

Abstract

As a collection of all the genetic variants in the gene pool, the pangenome is a concept that will become fundamental to conservation genomic studies. Unfortunately, most pangenomic approaches developed for humans and model organisms are financially impractical for conservation genomic studies of threatened or endangered species due to the high costs associated with deep sequencing of multiple individuals on long-read platforms. Here, by integrating metagenomic and iterative map-then-assemble approaches, we (1) propose novel workflows to construct graph pangenomes from multiple low-coverage short-read datasets; (2) benchmark these short-read pangenomes (both linear and graph) against a previously published long-read graph pangenome of the barn swallow; and (3) evaluate the utility of our workflows in population and conservation genomics. Our results indicate that economical short-read graph pangenomes can recover the vast majority of the variants identified through expensive long-read graph approaches, and that these variants accurately detect important biological signals (e.g., spatial structure and independent taxonomic delineations). These results mean that researchers can utilize their limited, conservation-oriented funding to more fully characterize all the variants in a particular gene pool for population-level analyses.

16 Nov 2024 Submitted to Molecular Ecology Resources

18 Nov 2024 Submission Checks Completed

18 Nov 2024 Assigned to Editor

18 Nov 2024 Review(s) Completed, Editorial Evaluation Pending

10 Dec 2024 Reviewer(s) Assigned

Peer review status: UNDER REVIEW