Distance analyses as a complement to ABC analyses
Historically, straightforward phylogenetic methods have been used to reconstruct the origins of invasive populations, particularly for asexual organisms (e.g., Havill, Montgomery, Yu, Shiyake, & Caccone, 2006; Qin & Gullan, 1998). However, for sexual organisms, where recombination and larger effective population sizes make the results of phylogenetic inference ambiguous, a popular workflow for determining the origin of an introduced population includes the following steps: 1) identify distinct genetic clusters using model-based clustering algorithms such as those implemented in Structure, FastStructure (Raj, Stephens, & Pritchard, 2014), and Admixture (Alexander & Lange, 2011), and/or using measures of genetic distance (e.g., Latreille, Milesi, Magalon, Mavingui, & Atyama, 2019; Negawo et al., 2020; Rahi et al., 2020); 2) create a random subset containing equal numbers of individuals from each genetic or geographic cluster; and 3) compare potential introduction scenarios using approximate Bayesian computation (ABC). Unfortunately, methods for interpreting “admixed” populations (i.e., populations with mixed probabilities of assignment) are still needed, because populations assigned to multiple genetic clusters are a common result (as reviewed in Lawson et al., 2018). We believe that distance-based clustering of the population coefficients of assignment from Bayesian clustering algorithms (such as the Structure ‘popfiles’) presents a rapid and useful approach for reconstructing the regions of origin of nonnative populations, particularly when populations in the native range are highly admixed and/or have limited genetic diversity, as we saw with populations of winter moth. This approach is particularly attractive because it is almost instantaneous once the clustering runs have completed. Additionally, it removes the need for the investigator to define arbitrary cutoffs for population assignments.
For example, when individuals have mixed probabilities of assignment (e.g., Q ≤ 0.75 to any one cluster when averaged across Structure runs), assigning samples or populations to distinct clusters might not be possible visually but is trivial for distance-based clustering algorithms, such as the one implemented in R.
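To illustrate the idea, the following is a minimal sketch of distance-based clustering applied to population mean assignment coefficients. It is written in Python for illustration only and is not the R routine used in this study. The choice of Euclidean distance with average linkage, the function name `cluster_populations`, the population labels, and all Q values are hypothetical assumptions made for the example. No population in the example exceeds Q = 0.75 for any one cluster.

```python
from itertools import combinations
from math import dist


def cluster_populations(q_by_pop, n_groups):
    """Average-linkage agglomerative clustering of population mean Q vectors.

    q_by_pop maps a population label to its mean assignment coefficients
    (one value per genetic cluster). Merging stops at n_groups groups.
    """
    # Start with every population in its own group.
    groups = [[name] for name in q_by_pop]

    def average_distance(a, b):
        # Mean Euclidean distance over all pairs drawn from groups a and b.
        total = sum(dist(q_by_pop[x], q_by_pop[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(groups) > n_groups:
        # Find the two closest groups and merge them.
        i, j = min(
            combinations(range(len(groups)), 2),
            key=lambda pair: average_distance(groups[pair[0]], groups[pair[1]]),
        )
        groups[i] = groups[i] + groups[j]
        del groups[j]

    return [sorted(group) for group in groups]


# Hypothetical mean Q values for K = 3 (illustrative only, not real data).
q_by_pop = {
    "native_A": (0.55, 0.35, 0.10),
    "native_B": (0.50, 0.40, 0.10),
    "native_C": (0.15, 0.30, 0.55),
    "native_D": (0.10, 0.35, 0.55),
    "introduced_X": (0.52, 0.38, 0.10),
}

groups = cluster_populations(q_by_pop, n_groups=2)
print(groups)
```

With these made-up values, the introduced population groups with native_A and native_B even though none of the three is strongly assigned to a single cluster, and no assignment cutoff has to be chosen.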