Rebecca Taylor

and 2 more

Runs of homozygosity (ROH) are increasingly being analyzed using whole genome sequences in non-model species as a measure of inbreeding and to assess demographic history, thus providing useful information for conservation. However, most studies have used PLINK for ROH inference, which has been shown to perform poorly when sequencing depth is below 10X, often underestimating the true proportion of the genome in ROH. This could lead to erroneous status assessments and management decisions. We use whole genome sequences from caribou, a non-model species at risk, subsampled to sequencing depths ranging from 1X to 15X, to assess the performance of ROHan, a program developed to enable ROH estimation using lower coverage sequences but so far only optimized for human data. We use 22 individuals with varying extents of inbreeding to assess the effects of sequencing depth, input parameters, and demographic history on the inference of ROH. We found that accurate estimation of the percentage of the genome in ROH and of ROH lengths can be achieved down to depths as low as 3-5X. However, input parameters and the demographic history of the individual can have a dramatic effect on results. Using our optimized settings, we then re-analyze low coverage sequences from a small and isolated caribou population and demonstrate high levels of inbreeding which had previously been missed. We provide recommendations for thorough optimization of parameters, including the need for multiple runs, as well as careful interpretation of outputs to enable robust ROH inference using low coverage whole genome sequences in wildlife species.
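As a rough illustration of the summary statistic discussed above (the percentage of the genome in ROH), the following minimal sketch sums ROH segment lengths and divides by genome length. It does not reproduce ROHan's or PLINK's inference; the function name `froh`, the `min_length` filter, and the segment coordinates are hypothetical and chosen only for illustration.

```python
def froh(segments, genome_length, min_length=0):
    """Fraction of the genome covered by ROH segments.

    segments: iterable of (start, end) coordinates in base pairs
              (hypothetical, non-overlapping segments).
    min_length: ignore segments shorter than this many base pairs.
    """
    total = sum(end - start for start, end in segments
                if end - start >= min_length)
    return total / genome_length


# Made-up segments on a made-up 100 Mb genome, for illustration only.
segments = [
    (1_000_000, 3_000_000),     # 2 Mb run
    (10_000_000, 10_500_000),   # 0.5 Mb run
    (20_000_000, 25_000_000),   # 5 Mb run
]
genome_length = 100_000_000

all_runs = froh(segments, genome_length)
long_runs = froh(segments, genome_length, min_length=1_000_000)
print(f"F_ROH (all runs): {all_runs:.3f}")
print(f"F_ROH (runs >= 1 Mb): {long_runs:.3f}")
```

Filtering by minimum length matters because short and long runs reflect inbreeding over different timescales, so the same individual can yield different values depending on the threshold chosen.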

Peng Liu

and 4 more

Accurate and efficient genotyping of microsatellite loci is essential for their application in population genetics and various demographic analyses. Protocols for next-generation sequencing of microsatellite loci enable high-throughput, cross-compatible allele scoring, addressing common issues associated with size separation in conventional capillary-based protocols. To support such protocols, we have developed Seq2Sat, a novel, ultra-fast, all-in-one software written in C++ for accurate automated microsatellite genotyping. It directly takes raw reads of microsatellite amplicons and performs read quality control before inferring genotypes based on read depth, sequence composition, and length. It does not produce any intermediate files, making I/O very efficient. Additionally, we developed a module in Seq2Sat for sex identification based on sex locus amplicons. We further developed a user-friendly web-based platform, SatAnalyzer, to conduct reads-to-report analyses by calling Seq2Sat to generate genotype tables and interactive genotype graphs for manual editing. SatAnalyzer also allows visualization of read quality and distribution across loci and samples to troubleshoot multiplex optimization and high-quality library preparation. To evaluate its performance, we benchmarked SatAnalyzer against conventional capillary gel electrophoresis and an existing microsatellite genotyping software, MEGASAT. Results show that SatAnalyzer can achieve > 0.993 genotyping accuracy and that Seq2Sat is ~ 5 times faster than MEGASAT despite generating many more informative tables and figures. Seq2Sat and SatAnalyzer are freely available on GitHub (https://github.com/ecogenomicscanada/Seq2Sat) and Docker Hub (https://hub.docker.com/r/rocpengliu/satanalyzer).
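The idea of inferring a genotype from read depth and allele length can be sketched as follows. This is a deliberately simplified illustration and not Seq2Sat's algorithm: the function name `call_genotype`, the thresholds `min_depth` and `het_ratio`, and the read data are all hypothetical, and sequence composition and stutter are ignored.

```python
from collections import Counter


def call_genotype(read_lengths, min_depth=10, het_ratio=0.3):
    """Call a diploid genotype at one locus from amplicon read lengths.

    read_lengths: allele length (bp) observed in each read at the locus.
    min_depth: minimum total reads required to attempt a call (assumed value).
    het_ratio: the second allele must have at least this fraction of the top
               allele's depth to be called heterozygous (assumed value).
    Returns a sorted tuple of two allele lengths, or None if depth is too low.
    """
    counts = Counter(read_lengths)
    if sum(counts.values()) < min_depth:
        return None
    ranked = counts.most_common(2)
    top_allele, top_depth = ranked[0]
    if len(ranked) == 2 and ranked[1][1] >= het_ratio * top_depth:
        return tuple(sorted((top_allele, ranked[1][0])))
    return (top_allele, top_allele)


# Made-up reads: two well-supported alleles plus a few artefact reads.
het_reads = [120] * 40 + [124] * 35 + [122] * 3
hom_reads = [120] * 50 + [118] * 5

print(call_genotype(het_reads))
print(call_genotype(hom_reads))
print(call_genotype([120] * 3))
```

A depth-ratio rule like this is only a starting point; real amplicon data would also need handling of stutter products and alleles of equal length but different sequence.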

Paul Wilson

and 1 more

The evolutionary origins and hybridization patterns of Canis species in North America have been hotly debated for the past 30 years. Disentangling ancestry and timing of hybridization in Great Lakes wolves, eastern Canadian wolves, red wolves, and eastern coyotes is most often partitioned into a 2-species model that assigns all ancestry to grey wolves and/or coyotes, and a 3-species model that includes a third, North American evolved eastern wolf genome. The proposed models address recent or sometimes late Holocene hybridization events but have largely ignored Pleistocene-era opportunities for hybridization that may have impacted the current mixed genomes in eastern Canada and the United States. Here, we re-analyze contemporary and ancient mitochondrial DNA genomes with Bayesian phylogenetic analyses to more accurately estimate divergence dates among lineages. We combine that with a review of the literature on Late Pleistocene Canis distributions to illuminate opportunities for ancient hybridization events between extinct Beringian grey wolves (C. lupus) and extinct large wolf-like coyotes (C. latrans orcutti) that we propose as a potentially unrecognized source of introgressed genomic variation within contemporary Canis genomes. These events speak to the potential origins of contemporary genomes and provide a new perspective on Canis ancestry, but do not influence or negate current conservation priorities for dwindling wolf populations with unique genomic signatures and ecologically critical roles.

Samantha McFarlane

and 2 more

1. In many social species, reproductive success varies between individuals within a population, resulting in socially structured populations. Social network analyses of familial relationships may provide insights on how fitness influences population-level demographic patterns. These methods have, however, rarely been applied to genetically derived pedigree data from wild populations.
2. Here we use social networks to reconstruct parent-offspring relationships and create a familial network from polygamous boreal woodland caribou (Rangifer tarandus caribou) in Saskatchewan, Canada, to inform recovery efforts. We collected samples from 933 individuals at 15 variable microsatellite loci along with caribou-specific primers for sex identification. Using social network metrics, we assess the contribution of individual caribou to the population with several centrality metrics and then determine which metrics are best suited to inform on the population demographic structure. We look at the centrality of individuals from eighteen different local areas, along with the entire population.
3. We found substantial differences in the centrality of individuals in different local areas, which in turn contributed differently to the full network, highlighting the importance of analyzing social networks at different scales. The full network revealed that boreal caribou in Saskatchewan form a complex, interconnected social network with strong familial ties, as the removal of edges with high betweenness did not result in distinct subgroups. Alpha, betweenness, and eccentricity centrality were the most informative metrics for characterizing the population demographic structure and for spatially identifying areas of highest fitness levels and social cohesion across the range.
4. Synthesis and applications: Our results demonstrate the value of different network metrics in assessing genetically derived familial networks. The spatial application of the familial networks identified areas of higher fitness levels and social cohesion across the range in support of population monitoring and recovery efforts.

Melanie Prentice

and 4 more

Clock genes exhibit substantial control over gene expression and ultimately life-histories using external cues such as photoperiod, and are thus likely to be critical for adaptation to shifting seasonal conditions and novel environments as species redistribute their ranges under climate change. Coding trinucleotide repeats (cTNRs) are found within several clock genes, and may be interesting targets of selection due to their containment within exonic regions and elevated mutation rates. Here, we conduct inter-specific characterization of the NR1D1 cTNR between Canada lynx and bobcat, and intra-specific spatial and environmental association analyses of neutral microsatellites and our functional cTNR marker, to investigate the role of selection on this locus in Canada lynx. We report signatures of divergent selection between lynx and bobcat, with the potential for hybrid-mediated gene flow in the area of range overlap. We also provide evidence that this locus is under selection across Canada lynx in eastern Canada, with both spatial and environmental variables significantly contributing to the explained variation, after controlling for neutral population structure. These results suggest that cTNRs may play an important role in the generation of functional diversity within some mammal species, and allow for contemporary rates of adaptation in wild populations in response to environmental change. We encourage continued investment into the study of cTNR markers to better understand their broader relevance to the evolution and adaptation of mammals.

Samantha McFarlane

and 6 more

Accurately estimating abundance is a critical component of monitoring and recovery of rare and elusive species. Spatial capture-recapture (SCR) models are an increasingly popular method for robust estimation of ecological parameters. We provide a maximum likelihood analytical framework to assess results from empirical studies to inform SCR sampling design, using both simulated and empirical data from non-invasive genetic sampling of seven boreal caribou populations (Rangifer tarandus caribou) which varied in range size and estimated population density. We use simulated population data with varying levels of clustered distributions to quantify the impact of non-independence of detections on density estimates, and empirical datasets to explore the influence of varied sampling intensity on the relative bias and precision of density estimates. Simulations revealed that clustered distributions of detections did not significantly impact relative bias or precision of density estimates. The empirical genotyping success rate was 95.1%. Empirical results indicated that reduced sampling intensity had a greater impact on density estimates in smaller ranges. The number of captures and spatial recaptures were strongly correlated with precision, but not relative bias. The best sampling designs did not differ with estimated population density but differed between large and small ranges. We provide an efficient framework implemented in R to estimate the detection parameters required when designing SCR studies. The framework can be used when designing a monitoring program to minimize effort and cost while maximizing effectiveness, which is critical for informing wildlife management and conservation.

Rebecca Taylor

and 5 more

Parallel evolution can occur through novel mutations, standing genetic variation, or adaptive introgression. Uncovering parallelism and introgressed populations can complicate management of threatened species, particularly as admixed populations are not generally considered under conservation legislation. We examined high coverage whole-genome sequences of 30 caribou (Rangifer tarandus) from across North America and Greenland, representing divergent intra-specific lineages, to investigate parallelism and levels of introgression contributing to the formation of ecotypes. Caribou are split into four subspecies and 11 extant conservation units, known as Designatable Units (DUs), in Canada. Using genomes from all four subspecies and six DUs, we undertake demographic reconstruction, confirm two previously inferred instances of parallel evolution in the woodland subspecies, and uncover an additional instance of parallelism in the eastern migratory ecotype. Detailed investigations reveal introgression in the woodland subspecies, with introgressed regions spread throughout the genomes and encompassing both neutral and functional sites. Our comprehensive investigations using whole genomes highlight the difficulties in unequivocally demonstrating parallelism through adaptive introgression in non-model species with complex demographic histories, with standing variation and introgression both potentially involved. Additionally, the impact of parallelism and introgression on the designation of conservation units has not been widely considered, and the caribou designations will need amending in light of our results. Uncovering and decoupling parallelism and differential patterns of introgression will become prevalent with the availability of comprehensive genomic data from non-model species, and we highlight the need to incorporate this into conservation unit designations.

Rebecca Taylor

and 3 more

Conservation genomics is an important tool to manage threatened species under current biodiversity loss. Recent advances in sequencing technology mean that we can now use whole genomes to investigate demographic history, local adaptation, inbreeding, and more in unprecedented detail. However, for many rare and elusive species only non-invasive samples such as faeces can be obtained, making it difficult to take advantage of whole genome data. We present a method to extract DNA from the mucosal layer of faecal samples and reconstruct high coverage whole genomes using standard laboratory techniques, and therefore in a cost-effective and efficient way. We use faecal pellets collected from wild caribou (Rangifer tarandus), a species undergoing declines in many parts of its range in Canada and subject to comprehensive conservation and population monitoring measures. We compare four faecal genomes to two tissue genomes sequenced in the same run. Quality metrics were similar between faecal and tissue samples, with the main difference being the alignment success of raw reads to the reference genome, likely due to differences in endogenous DNA content, which affected overall coverage. One of our faecal genomes was only reconstructed at low coverage (1.6X); however, the other three reached between 7X and 15X, compared to 19X and 25X for the tissue samples. We successfully reconstructed high-quality whole genomes from faecal DNA and, to our knowledge, are the first to obtain genome-wide data from wildlife faecal DNA in a non-primate species, representing an important advancement for non-invasive conservation genomics.