Swatting flies: Biting insects as non-invasive samplers for mammalian
population genomics
Abstract
Advances in next-generation sequencing have allowed the use of DNA
obtained from unusual sources for wildlife studies. However, these
samples have been used predominantly to sequence mitochondrial DNA for
species identification, while population genetic analyses have been
rare. Since next-generation sequencing allows indiscriminate detection
of all DNA fragments in a sample, technically it should be possible to
sequence whole genomes of animals from environmental samples. Here we
used a blood-feeding insect, the tsetse fly, to target whole-genome
sequences of wild animals. Using pools of flies, we compared how well
host genomic data could be recovered using short-read sequencing
(Illumina) and adaptive sampling of long-read data generated on the
Oxford Nanopore Technologies (ONT) platform. We found that tsetse fly
DNA made up most of the short-read data (85-99%) and that adaptive
sampling on the ONT platform did not substantially reduce this
proportion. However, once tsetse reads were removed, the remaining data
for both platforms tended to belong to the dominant host expected in the
tsetse fly blood meal. Reads mapping to elephants, warthogs and giraffes
were recovered more reliably than reads mapping to buffalo, and there
was high variance in the contribution of DNA by individual flies to the
pools, suggesting that there are host-specific biases. We were able to identify
over 300,000 SNPs for elephants, which we used to estimate the allele
frequencies and expected heterozygosity for the population. Overall, our
results show that at least for certain wild mammals, it is possible to
recover genome-wide host data from blood-feeding insects.
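
As an illustration of the allele frequency and expected heterozygosity estimates mentioned above, the following is a minimal sketch and not the pipeline used in this study. It assumes biallelic SNPs and takes the fraction of reads carrying the alternate allele as the allele frequency p, with per-site expected heterozygosity computed as 2p(1 - p). The function names and the read counts in `example_counts` are hypothetical and chosen only for demonstration.

```python
from typing import List, Tuple


def allele_frequency(ref_reads: int, alt_reads: int) -> float:
    """Estimate the alternate-allele frequency at one biallelic SNP
    as the fraction of reads supporting the alternate allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity at a biallelic site: 2p(1 - p)."""
    return 2.0 * p * (1.0 - p)


def mean_expected_heterozygosity(counts: List[Tuple[int, int]]) -> float:
    """Average per-site expected heterozygosity over all SNPs.

    `counts` holds one (ref_reads, alt_reads) pair per SNP.
    """
    if not counts:
        raise ValueError("no SNPs supplied")
    values = [
        expected_heterozygosity(allele_frequency(ref, alt))
        for ref, alt in counts
    ]
    return sum(values) / len(values)


# Hypothetical (ref_reads, alt_reads) counts for three SNPs; not study data.
example_counts = [(5, 5), (8, 2), (10, 0)]
example_mean_he = mean_expected_heterozygosity(example_counts)
```

Read-count frequencies from pooled blood meals are noisy when coverage is low or when individual flies contribute unequally to a pool, so a real analysis would likely filter sites by depth and may apply a sample-size correction to the heterozygosity estimate.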