Pleistocene persistence and expansion in tarantulas on the Colorado
Plateau and the effects of missing data on phylogeographical inferences
from RADseq
Abstract
Few phylogeographical studies exist for taxa inhabiting the Colorado
Plateau province. We combined mitochondrial and genomic data with
species distribution modeling to test Pleistocene hypotheses for
Aphonopelma marxi, a large tarantula endemic to the plateau
region. Mitochondrial and genomic analyses revealed that the species
comprises at least three main clades that diverged in the Pleistocene. A
clade distributed along the Mogollon Rim appears to have persisted in
place during the last glacial maximum, whereas the other two clades
probably colonized the central and northeastern portions of the species’
range from small refugial areas along river-carved canyons. Climate
models support this hypothesis for the Mogollon Rim, but late glacial
climate data appear too coarse to detect suitable areas in canyons.
Locations of canyon refugia could not be inferred from genomic analyses
due to missing data, encouraging us to explore the effect of missing
loci on phylogeographical inferences from RADseq. In phylogenetic
analyses, node support for major clades decreased with the addition of
samples with significant amounts of missing data (more than 30%).
Population genomic structure was greatly influenced by missing data,
with the group membership of many taxa changing as samples with missing
loci were added. Results from DAPC, a distance-based method, did not
change as samples with significant amounts of missing data were added. We
conclude that the identity of the missing loci matters more than the
number of missing loci, and that samples with missing data can still add
information to RADseq-based analyses as long as results are interpreted
cautiously.