Approximate Bayesian computation
To determine whether North American winter moth populations in distinct geographic regions resulted from a single introduction to Nova Scotia (the first region in North America where the species was recorded) that subsequently spread to additional locations (i.e., a stepping-stone model), from a novel introduction to each region (i.e., a serial introduction model), or from some combination of these, we compared the relatedness of each invasive population to the others and to the “Eastern European”, “Central European”, and “Western European” winter moth genetic clusters previously reported in Andersen et al. (2017, In Press). These comparisons used approximate Bayesian computation (ABC), as implemented in the software program DiyABC v.2.1.0 (Cornuet et al., 2008). For these analyses, thirty individuals were selected randomly from each of the three European clusters and from each of the four invasive regions. Ideally, we would compare all possible scenarios that include representatives from all native clusters and invasive regions; however, the number of possible scenarios increases at an unmanageable rate with each taxon added (e.g., there are 10,395 possible “scenarios” in a seven-taxa analysis). Therefore, we used an approach similar to the “tournament-ABC” approach presented in Stone et al. (2017).
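The figure of 10,395 is consistent with the number of rooted, bifurcating topologies for seven taxa, (2n − 3)!!. The following is a minimal sketch of that calculation, assuming this is how the count was obtained (the text does not state the formula, and the function name is illustrative only):

```python
def rooted_binary_topologies(n_taxa: int) -> int:
    """Number of rooted, bifurcating tree topologies for n_taxa labelled taxa.

    Uses the double factorial (2n - 3)!! = 3 * 5 * ... * (2n - 3).
    Assumption: this is the quantity behind the scenario count quoted in the text.
    """
    if n_taxa < 1:
        raise ValueError("n_taxa must be at least 1")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, rooted_binary_topologies(n))
```

Under this assumption, each added taxon multiplies the count by the next odd number, which is why an exhaustive comparison becomes impractical so quickly.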
As in Stone et al. (2017), we used a series of hierarchical ABC analyses in which subsets of scenarios are first compared in “tournaments” to reduce computational complexity. Here, we first fixed the relationship among the Eastern, Central, and Western European genetic clusters following Andersen et al. (2017), where it was determined that the Central European cluster was likely the result of admixture between the Eastern and Western clusters during the post-glacial recolonization of the European continent after the last glacial maximum. To this topology, we added an unsampled “ghost” population to represent a possible extra-European origin for each invasive population. Tournament scenarios were then built sequentially, following the documented order of the invasion history (graphical representations of the scenarios from each tournament are presented in Supplemental Appendix Figures S1-S4). The first tournament compared four scenarios in which the Nova Scotia population could have originated from one of the European clusters or from the extra-European “ghost” population. The second tournament compared five scenarios testing the relationship of the Oregon population to each putative source population, with the relationship of the Nova Scotia population fixed according to the “best” scenario from Tournament 1. The third tournament compared six scenarios testing the relationship of the British Columbia population to each putative source population, with the relationships of the Oregon and Nova Scotia populations fixed according to the “best” scenario from Tournament 2. Finally, the fourth tournament compared seven scenarios testing the relationship of the Northeastern United States population to each putative source population, with the relationships of the British Columbia, Oregon, and Nova Scotia populations fixed according to the “best” scenario from Tournament 3. For each tournament, a reference table of 1,000,000 simulated datasets per scenario was generated.
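The sequential design can be sketched as a simple enumeration. This is an illustration only, not the DiyABC configuration: it assumes that the one additional scenario in each successive tournament corresponds to the invasive population placed in the previous tournament becoming a candidate source, which matches the counts of four, five, six, and seven scenarios given above. The variable and function names are hypothetical.

```python
# Candidate sources available in every tournament (three European clusters
# plus the unsampled "ghost" population described in the text).
BASE_SOURCES = ["Eastern European", "Central European", "Western European", "ghost"]

# Invasive populations in the documented order of the invasion history.
INVASION_ORDER = [
    "Nova Scotia",
    "Oregon",
    "British Columbia",
    "Northeastern United States",
]


def tournament_plan():
    """Return a list of (target population, candidate sources), one per tournament.

    Assumption: populations placed in earlier tournaments are added to the
    pool of putative sources for later tournaments.
    """
    placed = []
    plan = []
    for target in INVASION_ORDER:
        plan.append((target, BASE_SOURCES + placed))
        placed.append(target)
    return plan


if __name__ == "__main__":
    total = 0
    for i, (target, sources) in enumerate(tournament_plan(), start=1):
        total += len(sources)
        print(f"Tournament {i}: {target}, {len(sources)} scenarios")
    print("Total scenarios simulated:", total)
```

Under these assumptions the four tournaments together require 22 scenarios, rather than the thousands needed for an exhaustive comparison.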
Under each scenario, we included parameters that allowed population sizes to change following splitting/merging events, as well as the potential for a genetic bottleneck in each invasive population. We used the default mutation model parameters, with the minimum mean mutation rate set to 1 × 10⁻⁵ and the maximum values for the mean and individual-locus coefficient P both set to 1.0. As per Andersen et al. (2017, In Press), we removed four loci with especially large allelic ranges (02339, 00925, 02191, and 12042) to improve the shape of the cloud of simulated datasets. We calculated three one-sample summary statistics (mean number of alleles, mean genetic diversity, and mean size variance) and three two-sample summary statistics (FST, classification index, and (δμ)² distance). For each tournament, the scenario representing the ancestral origin of each invasive population was determined by comparing the results of the Logistic Regression test implemented in DiyABC, based on the 1% of simulated datasets closest to the observed data. Model checking for each tournament was performed by examining the results of a principal components analysis (PCA) that included the supported scenario, as part of the Perform Model Checking analysis.
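For readers unfamiliar with the one-sample summary statistics, the sketch below shows how they might be computed from microsatellite allele sizes. It is not DiyABC's code: it assumes Nei's unbiased estimator for genetic diversity and the sample variance (n − 1 denominator) for allele size variance, and the function names and input layout (one list of allele sizes per locus) are hypothetical.

```python
from collections import Counter
from statistics import mean, variance


def locus_stats(allele_sizes):
    """Return (number of alleles, gene diversity, allele size variance) for one locus.

    allele_sizes: allele sizes for every gene copy sampled at the locus.
    Assumptions: Nei's unbiased gene diversity and the sample variance.
    """
    n = len(allele_sizes)
    if n < 2:
        raise ValueError("need at least two gene copies per locus")
    counts = Counter(allele_sizes)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    gene_diversity = n / (n - 1) * (1.0 - sum_p2)
    return len(counts), gene_diversity, variance(allele_sizes)


def one_sample_summary(loci):
    """Average the per-locus statistics across all loci for one population."""
    per_locus = [locus_stats(sizes) for sizes in loci]
    return {
        "mean_number_of_alleles": mean([s[0] for s in per_locus]),
        "mean_genetic_diversity": mean([s[1] for s in per_locus]),
        "mean_size_variance": mean([s[2] for s in per_locus]),
    }


if __name__ == "__main__":
    # Two made-up loci, four gene copies each (illustrative values only).
    example = [[100, 100, 102, 104], [200, 200, 200, 200]]
    print(one_sample_summary(example))
```

The same per-population statistics would be computed for the observed data and for every simulated dataset, so that distances between them can be used to identify the simulations closest to the observations.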