Matthew DeSaix

Public Documents 5

gscramble: Simulation of admixed individuals without reuse of genetic material

Eric Anderson

and 3 more

August 25, 2024

While a best practice for evaluating the behavior of genetic clustering algorithms on empirical data is to conduct parallel analyses on simulated data, such simulation techniques often involve sampling genetic data with replacement. In this paper we demonstrate that sampling with replacement, especially with large marker sets, inflates the perceived statistical power to correctly assign individuals (or the alleles that they carry) back to source populations, a phenomenon we refer to as resampling-induced, spurious power inflation (RISPI). To address this issue, we present gscramble, a simulation approach in R for creating biologically informed individual genotypes from empirical data that 1) samples alleles from populations without replacement and 2) segregates alleles based on species-specific recombination rates. This framework makes it possible to simulate admixed individuals in a way that respects the physical linkage between markers on the same chromosome and does not suffer from RISPI. gscramble achieves this by allowing users to specify pedigrees of varying complexity in order to simulate admixed genotypes, segregating and tracking haplotype blocks from different source populations through those pedigrees, and then sampling alleles from empirical data into those haplotype blocks using a variety of permutation schemes. We demonstrate the functionality of gscramble with both simulated and empirical data sets and highlight additional uses of the package that users may find valuable.
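The difference between the two sampling schemes can be illustrated with a minimal Python sketch. This is not gscramble's implementation (the package is written in R and also tracks pedigrees and recombination); the function name `sample_alleles` and the 20-copy allele pool are hypothetical and chosen only for illustration. The sketch shows that drawing without replacement uses each empirical allele copy at most once, whereas drawing with replacement can reuse copies.

```python
import numpy as np


def sample_alleles(pool, n_draws, replace, rng):
    """Draw n_draws allele copies from a source-population pool.

    Returns the indices of the copies drawn and the allele states at
    those indices. `pool` is a 1-D array of allele states (0 or 1).
    """
    idx = rng.choice(len(pool), size=n_draws, replace=replace)
    return idx, pool[idx]


rng = np.random.default_rng(0)

# Hypothetical pool: 20 allele copies at one biallelic locus.
pool = rng.integers(0, 2, size=20)

# Scheme A: sampling with replacement (copies may be reused).
idx_with, alleles_with = sample_alleles(pool, 20, replace=True, rng=rng)

# Scheme B: sampling without replacement (a permutation of the pool).
idx_without, alleles_without = sample_alleles(pool, 20, replace=False, rng=rng)

# Count draws that landed on an already-used allele copy.
reused_with = len(idx_with) - len(np.unique(idx_with))
reused_without = len(idx_without) - len(np.unique(idx_without))

print("copies reused with replacement:   ", reused_with)
print("copies reused without replacement:", reused_without)
```

Without replacement, the simulated individuals carry exactly the allele copies present in the empirical pool, so no copy appears in both the simulated individuals and the individuals used to estimate source-population allele frequencies.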

Broad-scale seasonal climate tracking is a consequence, not a driver, of avian migrat...

Marius Somveille

and 3 more

November 14, 2023

Tracking climatic conditions throughout the year is often assumed to be an adaptive behavior underlying seasonal migration patterns in animal populations. In this study, we investigate this hypothesis using genetic marker data to map migratory connectivity for 22 genetically distinct bird populations across 6 species. We found that the variation in seasonal climate tracking at a continental scale is more likely a consequence, rather than an underlying driver, of migratory connectivity, which is itself largely shaped by energy efficiency, i.e., optimizing the balance between accessing available resources and the cost of movement. However, our results also suggest that regional-scale seasonal precipitation tracking affects migration destinations, thus revealing a potential scale-dependency of the ecological processes driving migration. Our results have implications for the conservation of migratory species under climate change, as populations that track climate seasonally are potentially at higher risk if they adapt to a narrow range of climatic conditions.

Genomics-informed conservation units reveal spatial variation in climate vulnerabilit...

Caitlin Miller

and 13 more

August 03, 2023

Identifying genetic conservation units (CUs) in threatened species is critical for the preservation of adaptive capacity and evolutionary potential in the face of climate change. However, delineating CUs in highly mobile species remains a challenge due to high rates of gene flow and genetic signatures of isolation by distance. Even when CUs are delineated in highly mobile species, they often lack the key biological information needed to guide management decisions, namely which populations have the greatest conservation need. Here we implement a framework for rigorous CU identification in the Canada Warbler (Cardellina canadensis), a highly mobile migratory bird species of conservation concern, and then integrate demographic modeling and genomic offset within a CU framework to guide conservation decisions. We find that whole-genome structure in this highly mobile species is primarily driven by putative adaptive variation. Identification of CUs across the breeding range revealed that Canada Warblers fall into two Evolutionarily Significant Units (ESUs) and three putative Adaptive Units (AUs) in the South, East and Northwest. Quantification of genomic offset within each AU reveals significant spatial variation in climate vulnerability, with the Northwestern AU identified as the most vulnerable to future climate change based on genomic offset predictions. In contrast, quantification of past population trends within each AU revealed that the steepest population declines have occurred within the Eastern AU. Overall, we illustrate that genomics-informed CUs provide a strong foundation for identifying current and potential future regional threats, which can be used to manage highly mobile species in a rapidly changing world.

Low-coverage whole-genome sequencing for highly accurate population assignment: Mappi...

Matthew DeSaix

and 14 more

June 16, 2023

Understanding the geographic linkages among populations across the annual cycle is essential for understanding the ecology and evolution of migratory species and for facilitating their effective conservation. While genetic markers have been widely applied to describe migratory connections, the rapid development of new sequencing methods, such as low-coverage whole-genome sequencing (lcWGS), provides new opportunities for improved estimates of migratory connectivity. Here, we use lcWGS to identify fine-scale population structure in a widespread songbird, the American Redstart (Setophaga ruticilla), and accurately assign individuals to genetically distinct breeding populations. Assignment of individuals from the nonbreeding range reveals population-specific patterns of varying migratory connectivity. By combining migratory connectivity results with demographic analysis of population abundance and trends, we consider full annual cycle conservation strategies for preserving numbers of individuals and genetic diversity. Notably, we highlight the importance of the Northern Temperate-Greater Antilles migratory population as containing the largest proportion of individuals in the species. Finally, we highlight valuable considerations for other population assignment studies aimed at using lcWGS. Our results have broad implications for improving our understanding of the ecology and evolution of migratory species through conservation genomics approaches.

Population assignment from genotype likelihoods for low-coverage whole-genome sequenc...

Matthew DeSaix

and 3 more

June 02, 2023

Low-coverage whole-genome sequencing (WGS) is increasingly used for the study of evolution and ecology in both model and non-model organisms; however, effective application of low-coverage WGS data requires the implementation of probabilistic frameworks to account for the uncertainties in genotype likelihood data. Here, we present a probabilistic framework for using genotype likelihood data for standard population assignment applications. Additionally, we derive the Fisher information for allele frequency from genotype likelihood data and use it to describe a novel metric, the effective sample size, which figures heavily in assignment accuracy. We make these developments available for application through WGSassign, an open-source software package that is computationally efficient for working with whole-genome data. Using simulated and empirical data sets, we demonstrate the behavior of our assignment method across a range of population structures, sample sizes, and read depths. Through these results, we show that WGSassign can provide highly accurate assignment, even for samples with low average read depths (< 0.01X) and among weakly differentiated populations. Our simulation results highlight the importance of equalizing the effective sample sizes among source populations in order to achieve accurate population assignment with low-coverage WGS data. We further provide study design recommendations for population-assignment studies and discuss the broad utility of effective sample size for studies using low-coverage WGS data.
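As a rough illustration of population assignment from genotype likelihoods, the Python sketch below scores one individual against candidate source populations. It is not WGSassign's code or API. It assumes a common formulation in which, at each biallelic locus, the genotype likelihoods are weighted by Hardy-Weinberg genotype probabilities computed from each population's allele frequency, and loci are treated as independent. The function name `assignment_loglik` and all input values are hypothetical.

```python
import numpy as np


def assignment_loglik(gl, freqs, eps=1e-6):
    """Log-likelihood of one individual under each candidate population.

    gl    : (L, 3) array of genotype likelihoods for 0, 1, 2 copies of
            the alternate allele at each of L loci.
    freqs : (K, L) array of alternate-allele frequencies for K populations.
    eps   : frequencies are clipped to [eps, 1 - eps] to avoid log(0).

    Assumes Hardy-Weinberg proportions and independent loci (an
    assumption of this sketch, not taken from the abstract).
    """
    p = np.clip(freqs, eps, 1.0 - eps)
    # Genotype probabilities under Hardy-Weinberg, shape (K, L, 3).
    hwe = np.stack([(1.0 - p) ** 2, 2.0 * p * (1.0 - p), p ** 2], axis=-1)
    # Marginalize over the unobserved genotype at each locus, shape (K, L).
    per_locus = (hwe * gl[None, :, :]).sum(axis=-1)
    # Sum log-likelihoods across loci, shape (K,).
    return np.log(per_locus).sum(axis=1)


# Hypothetical genotype likelihoods for one individual at three loci.
gl = np.array([
    [0.05, 0.15, 0.80],
    [0.10, 0.30, 0.60],
    [0.70, 0.25, 0.05],
])

# Hypothetical alternate-allele frequencies in two source populations.
freqs = np.array([
    [0.9, 0.8, 0.1],
    [0.2, 0.3, 0.7],
])

loglik = assignment_loglik(gl, freqs)
best = int(np.argmax(loglik))
print("log-likelihoods:", loglik)
print("assigned population index:", best)
```

With these made-up inputs the individual is assigned to population 0, whose allele frequencies best match the genotypes favored by the likelihoods. Working with likelihoods rather than called genotypes lets uncertain, low-depth loci contribute in proportion to the information they carry.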