An Efficient Pipeline for Ancient DNA Mapping and Recovery of Endogenous
Ancient DNA from Whole-Genome Sequencing Data
Abstract
Ancient DNA research has developed rapidly over the past few decades
owing to improvements in PCR and next-generation sequencing (NGS)
technologies, but challenges remain. A major challenge is recovering
genuine endogenous ancient DNA sequences from raw sequencing data. This
is often
difficult due to the degradation of ancient DNA and high levels of
contamination, especially homologous contamination. In this study, we
collected whole-genome sequencing (WGS) data from six ancient samples to
compare different mapping algorithms. To explore more effective methods
for separating endogenous DNA from homologous contamination, we
attempted to recover reads based on characteristics specific to ancient
DNA (deamination, depurination, and DNA fragmentation) under different
parameter settings. We propose a quick and improved pipeline for
separating endogenous ancient DNA while simultaneously reducing
homologous contamination to a very low proportion. Overall, the
recommendations from this study for ancient DNA mapping and separation of
endogenous DNA could facilitate future ancient DNA studies.
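
As a rough illustration of what recovering reads from damage characteristics can look like, the sketch below keeps reads that are short (fragmentation) and carry a terminal C>T or G>A mismatch against the reference (deamination). It is not the pipeline proposed in this study: the function names, the 100 bp length cutoff, and the 3-base terminal window are hypothetical, the alignment is assumed to be ungapped, and depurination is not modelled.

```python
from typing import Iterable, List, Tuple


def has_terminal_deamination(read: str, ref: str, window: int = 3) -> bool:
    """Return True if the read shows a C>T mismatch within the first
    `window` bases or a G>A mismatch within the last `window` bases.

    `read` and `ref` are the aligned read and the reference segment it
    maps to; an ungapped alignment of equal length is assumed.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(read) != len(ref):
        raise ValueError("read and ref must have the same length")
    read, ref = read.upper(), ref.upper()
    # 5' end: reference C observed as T in the read.
    five_prime = any(
        r == "C" and q == "T" for q, r in zip(read[:window], ref[:window])
    )
    # 3' end: reference G observed as A in the read.
    three_prime = any(
        r == "G" and q == "A" for q, r in zip(read[-window:], ref[-window:])
    )
    return five_prime or three_prime


def select_putative_endogenous(
    pairs: Iterable[Tuple[str, str]],
    max_length: int = 100,  # hypothetical cutoff, not taken from the study
    window: int = 3,        # hypothetical terminal window
) -> List[str]:
    """Keep reads that are both short and damaged at a terminus."""
    return [
        read
        for read, ref in pairs
        if len(read) <= max_length
        and has_terminal_deamination(read, ref, window)
    ]


if __name__ == "__main__":
    example = [
        ("TGTACGTA", "CGTACGTA"),  # C>T at the 5' end
        ("ACGTACGT", "ACGTACGT"),  # no mismatch
        ("ACGTACAA", "ACGTACGA"),  # G>A near the 3' end
    ]
    print(select_putative_endogenous(example))
```

Requiring damage on every retained read discards undamaged endogenous reads as well, so a filter of this kind trades yield for lower contamination; the cutoffs would need tuning per sample.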