Distance analyses as a complement to ABC analyses
Historically, straightforward phylogenetic methods have been used to
reconstruct the origins of invasive populations, particularly of asexual
organisms (e.g., Havill, Montgomery, Yu, Shiyake, & Caccone, 2006; Qin
&amp; Gullan, 1998). However, for sexual organisms, where recombination and
larger effective population sizes make the results of phylogenetic
inference ambiguous, a popular workflow for determining the origin of an
introduced population includes the following steps: 1) identify
distinct genetic clusters using Bayesian algorithms such as those
implemented in Structure, FastStructure (Raj,
Stephens, &amp; Pritchard, 2014), and Admixture (Alexander &amp;
Lange, 2011), measures of genetic distance (e.g.,
Latreille, Milesi, Magalon, Mavingui, &amp; Atyama, 2019; Negawo et al.,
2020; Rahi et al., 2020), or both; 2) draw a random subset containing an
equal number of individuals from each genetic or geographic cluster; and 3) compare
potential introduction scenarios using approximate Bayesian computation.
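
Step 2 of this workflow can be illustrated with a minimal sketch. It is not the code used in any of the cited studies: the function name `equal_subsample`, the individual IDs, and the cluster labels are all hypothetical, and the subset size is assumed to equal the size of the smallest cluster.

```python
import random
from collections import defaultdict


def equal_subsample(assignments, seed=0):
    """Draw the same number of individuals at random from every cluster.

    assignments: dict mapping individual ID -> cluster label (hypothetical input).
    The subset size is assumed to be the size of the smallest cluster.
    """
    by_cluster = defaultdict(list)
    for individual, cluster in sorted(assignments.items()):
        by_cluster[cluster].append(individual)

    n = min(len(members) for members in by_cluster.values())
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return {
        cluster: sorted(rng.sample(members, n))
        for cluster, members in by_cluster.items()
    }


# Hypothetical example: cluster A has three individuals, cluster B has two.
example = {"ind1": "A", "ind2": "A", "ind3": "A", "ind4": "B", "ind5": "B"}
subset = equal_subsample(example)
```

Here each cluster contributes two individuals, because the smaller cluster (B) contains only two.
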
Unfortunately, methods for interpreting “admixed” populations
(i.e., populations with mixed probabilities of assignment) are still needed,
because assignment of populations to multiple genetic clusters is a common
result (as reviewed in Lawson et al., 2018). We believe that
distance-based clustering of the population coefficients of assignment
from Bayesian clustering algorithms (such as the Structure
‘popfiles’) offers a rapid and useful approach for reconstructing
the regions of origin of nonnative populations, particularly when
populations in the native range are highly admixed and/or have limited
genetic diversity, as we saw with populations of winter moth. This
approach is particularly attractive because it is almost instantaneous
once the clustering runs have completed. Additionally, the
approach removes the need for the investigator to define arbitrary
cutoffs for population assignments. For example, when
individuals have mixed probabilities of assignment
(e.g., Q ≤ 0.75 to any one cluster when averaged
across Structure runs), assigning samples or populations to
distinct clusters might not be possible by visual inspection but is trivial for
distance-based clustering algorithms, such as those implemented in R.
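
The idea of clustering populations by their assignment coefficients can be sketched as follows. This is only an illustration under stated assumptions, not the analysis performed here: it is written in Python rather than R, it assumes Euclidean distance and average linkage (the text above specifies neither), and the population names and Q values are invented for the example.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two vectors of assignment coefficients."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(q_by_pop):
    """Agglomerative clustering with average linkage (an assumed choice).

    q_by_pop: dict mapping population name -> list of mean Q values.
    Returns the merges in order as (group_a, group_b, distance) tuples.
    """
    groups = [(name,) for name in sorted(q_by_pop)]
    merges = []

    def group_distance(a, b):
        # Mean pairwise distance between members of the two groups.
        total = sum(euclidean(q_by_pop[x], q_by_pop[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(groups) > 1:
        # Find the closest pair of groups and merge them.
        dist, i, j = min(
            (group_distance(groups[i], groups[j]), i, j)
            for i in range(len(groups))
            for j in range(i + 1, len(groups))
        )
        merges.append((groups[i], groups[j], dist))
        merged = groups[i] + groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return merges


# Hypothetical mean Q values for K = 3 clusters; no population exceeds 0.80.
q_values = {
    "native_west": [0.80, 0.15, 0.05],
    "native_central": [0.45, 0.45, 0.10],
    "native_east": [0.10, 0.30, 0.60],
    "introduced": [0.50, 0.40, 0.10],
}
merges = average_linkage(q_values)
first_a, first_b, first_dist = merges[0]
```

In this invented example the first merge joins the introduced population with `native_central`, even though neither population has Q > 0.75 for any single cluster, so no assignment cutoff is needed. In R, the built-in `dist` and `hclust` functions provide comparable functionality, although the text above does not state which function was used.
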