Rebecca Taylor

and 2 more

Runs of homozygosity (ROH) are increasingly being analyzed using whole genome sequences in non-model species as a measure of inbreeding and to assess demographic history, thus providing useful information for conservation. However, most studies have used PLINK for ROH inference, which has been shown to perform poorly when sequencing depth is below 10X, often underestimating the true proportion of the genome in ROH; this could lead to erroneous status assessments and management decisions. We use whole genome sequences from caribou, a non-model species at risk, subsampled to sequencing depths ranging from 1X to 15X, to assess the performance of ROHan, a program developed to enable ROH estimation from lower-coverage sequences but so far only optimized for human data. We use 22 individuals with varying degrees of inbreeding to assess the effects of sequencing depth, input parameters, and demographic history on the inference of ROH. We found that accurate estimation of the percentage of the genome in ROH and of ROH lengths can be achieved at depths as low as 3-5X. However, input parameters and the demographic history of the individual can have a dramatic effect on results. Using our optimized settings, we then re-analyze low-coverage sequences from a small and isolated caribou population and demonstrate high levels of inbreeding that had previously been missed. We provide recommendations for thorough optimization of parameters, including the need for multiple runs, as well as careful interpretation of outputs, to enable robust ROH inference using low-coverage whole genome sequences in wildlife species.
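As a rough illustration of the quantity discussed above, the proportion of the genome in ROH can be computed from a list of inferred segments. This is a minimal sketch, not the output format or method of ROHan or PLINK: the function name `froh`, the `(start, end)` segment layout, and the `min_length` filter are assumptions made for the example, and segments are assumed to be non-overlapping, half-open intervals in base pairs.

```python
from typing import Iterable, Tuple


def froh(segments: Iterable[Tuple[int, int]],
         genome_length: int,
         min_length: int = 0) -> float:
    """Fraction of the genome covered by ROH segments of at least min_length bp.

    Assumes segments are non-overlapping half-open intervals (start, end).
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    # Sum the lengths of all segments that pass the length filter.
    covered = sum(end - start
                  for start, end in segments
                  if end - start >= min_length)
    return covered / genome_length


# Hypothetical segments on a 100 Mb genome (values invented for illustration).
example_segments = [(0, 1_500_000), (4_000_000, 4_200_000), (9_000_000, 12_000_000)]
print(froh(example_segments, 100_000_000))                        # all segments
print(froh(example_segments, 100_000_000, min_length=1_000_000))  # long ROH only
```

Restricting the sum to longer segments is one common way to separate recent from older inbreeding, but the appropriate length threshold depends on the species and data and is not specified here.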

Peng Liu

and 4 more

Accurate and efficient genotyping of microsatellite loci is essential for their application in population genetics and various demographic analyses. Protocols for next-generation sequencing of microsatellite loci offer high-throughput and cross-compatible allele scoring, addressing common issues associated with size separation in conventional capillary-based protocols. We have therefore developed Seq2Sat, a novel, ultra-fast, all-in-one software tool written in C++, to support accurate automated microsatellite genotyping. It takes raw reads of microsatellite amplicons directly and performs read quality control before inferring genotypes based on read depth, sequence composition and length. It does not produce any intermediate files, making I/O very efficient. Additionally, we developed a module in Seq2Sat for sex identification based on sex locus amplicons. We further developed a user-friendly web-based platform, SatAnalyzer, which conducts reads-to-report analyses by calling Seq2Sat to generate genotype tables and interactive genotype graphs for manual editing. SatAnalyzer also allows visualization of read quality and read distribution across loci and samples, to help troubleshoot multiplex optimization and support high-quality library preparation. To evaluate its performance, we benchmarked SatAnalyzer against conventional capillary gel electrophoresis and an existing microsatellite genotyping software, MEGASAT. Results show that SatAnalyzer can achieve > 0.993 genotyping accuracy and that Seq2Sat is ~ 5 times faster than MEGASAT despite generating many more informative tables and figures. Seq2Sat and SatAnalyzer are freely available on GitHub (https://github.com/ecogenomicscanada/Seq2Sat) and Docker Hub (https://hub.docker.com/r/rocpengliu/satanalyzer).
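To make the idea of depth-based genotype inference concrete, the sketch below calls a diploid genotype at one locus from the lengths of the amplicon reads. It is not Seq2Sat's algorithm: the function `call_genotype`, the `min_depth` and `het_ratio` thresholds, and the example read counts are all assumptions for illustration, and the sketch ignores sequence composition and stutter artifacts, which a real caller would need to handle.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def call_genotype(read_lengths: Sequence[int],
                  min_depth: int = 10,
                  het_ratio: float = 0.3) -> Optional[Tuple[int, int]]:
    """Call a diploid genotype (pair of allele lengths) from read lengths.

    Returns None when total depth is below min_depth. The second most
    frequent allele is accepted only if its depth is at least het_ratio
    times that of the most frequent allele; otherwise the call is homozygous.
    """
    counts = Counter(read_lengths)
    if sum(counts.values()) < min_depth:
        return None
    # Rank alleles by depth (descending); break ties by shorter length first.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top_len, top_n = ranked[0]
    if len(ranked) > 1 and ranked[1][1] >= het_ratio * top_n:
        return tuple(sorted((top_len, ranked[1][0])))
    return (top_len, top_len)


# Hypothetical locus: two well-supported alleles plus a few noise reads.
reads = [120] * 40 + [128] * 35 + [124] * 3
print(call_genotype(reads))  # (120, 128)
```

A fixed ratio threshold is the simplest possible design choice here; it trades sensitivity to unbalanced heterozygotes against robustness to low-level noise.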