Approximate Bayesian computation
To determine whether populations of North American winter moth in
distinct geographic regions were the result of a single introduction to
Nova Scotia (the first region of introduction recorded in North America)
that subsequently spread to additional locations in North America
(i.e., following a stepping-stone model), or whether each invasive
population represents a novel introduction (i.e., a serial introduction
model), or some combination of these, we compared the relatedness of
the invasive populations to one another and to the “Eastern European”,
“Central European”, and “Western European” winter moth genetic
clusters previously reported in Andersen et al. (2017, In Press) using
approximate Bayesian computation (ABC), as implemented in the software
program DiyABC v.2.1.0 (Cornuet et al., 2008). For these
analyses, thirty individuals were selected randomly from each of the
three European clusters and from each of the four invasive regions.
Ideally, we would be able to perform comparisons of all possible
scenarios that include representatives from all native clusters and
invasive regions; however, the number of possible scenarios increases at
an unmanageable rate with each taxon added (e.g., there are 10,395
possible “scenarios” in a seven-taxon analysis). Therefore, we used an
approach similar to the “tournament-ABC” presented in Stone et al.
(2017).
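
The figure of 10,395 is consistent with the number of rooted, bifurcating topologies for n labelled taxa, (2n - 3)!!. A minimal sketch of that count follows, assuming that each “scenario” corresponds to one such topology (admixture events and unsampled populations are ignored here); the function name is illustrative and not part of DiyABC.

```python
def rooted_topology_count(n_taxa: int) -> int:
    """Number of rooted, bifurcating tree topologies for n labelled taxa: (2n - 3)!!."""
    if n_taxa < 1:
        raise ValueError("n_taxa must be at least 1")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count


# Growth of the scenario space as taxa are added.
for n in range(2, 8):
    print(n, rooted_topology_count(n))
```

With seven taxa the function returns 10,395, and each additional taxon multiplies the count by the next odd number, which is why exhaustive comparison quickly becomes impractical.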
As in Stone et al. (2017), we used a series of hierarchical ABC analyses
in which subsets of scenarios were first compared in “tournaments” to
reduce computational complexity. Here, we first fixed the relationship
among the Eastern, Central, and Western European genetic clusters
following Andersen et al. (2017), where it was determined that the
Central European cluster was likely the result of admixture between the
Eastern and Western clusters following the post-glacial recolonization
of the European continent after the last glacial maximum. To this
topology, we also added an unsampled “ghost” population to represent a
possible extra-European origin for each invasive population. Tournament
scenarios were then built sequentially, following the documented order
of the invasion history (graphical representations of scenarios from
each tournament are presented in Supplemental Appendix Figures S1-S4).
The first tournament compared four scenarios where the Nova Scotia
population could have originated from one of the European clusters or
the extra-European “ghost” population. In the second tournament, five
scenarios were compared testing the relationship of the Oregon
population to each putative source population with the relationship of
the Nova Scotia population set based on the “best” scenario from
Tournament 1. In the third tournament, six scenarios were compared
testing the relationship of the British Columbia population to each
putative source population with the relationships of the Oregon and Nova
Scotia populations set based on the “best” scenario from Tournament 2.
Finally, in the fourth tournament, seven scenarios were compared testing
the relationship of the Northeastern United States population to each
putative source population with the relationships of the British
Columbia, Oregon, and Nova Scotia populations set based on the “best”
scenario from Tournament 3. For each tournament, a reference table of
1,000,000 generations per scenario was generated. Under each scenario, we
included parameters that allowed population sizes to change following
splitting/merging events, as well as the potential for a genetic
bottleneck in each invasive population. We used default mutation model
parameters, with the minimum mean mutation rate set to 1 × 10⁻⁵ and the
maximum values for the mean and individual-locus coefficient P both set
to 1.0. As per Andersen et al.
(2017, In Press), we removed four loci with especially large allelic
ranges (02339, 00925, 02191, and 12042) to improve the shape of the
cloud of simulated datasets. We calculated three one-sample summary
statistics (mean number of alleles, mean genetic diversity, and mean
size variance) and three two-sample summary statistics (FST,
classification index, and (dµ)² distance). For each tournament, the scenario
representing the ancestral origin of each invasive population was
determined using the Logistic Regression test implemented in DiyABC,
based on the 1% of simulated datasets closest to the observed data.
Model checking for each tournament was performed using a principal
components analysis (PCA) of the supported scenario, run as part of
the Perform Model Checking analysis.
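
The sequential tournament design described above can be sketched as follows. This is only an illustration of how the candidate sources accumulate. It assumes that each tournament tests the focal invasive population against the three European clusters, the “ghost” population, and every invasive population placed in an earlier tournament, which is consistent with the reported counts of four, five, six, and seven scenarios. The labels and the function name are hypothetical and are not DiyABC inputs.

```python
# Hypothetical labels; the order follows the tournaments described above.
NATIVE_SOURCES = ["Eastern European", "Central European", "Western European", "ghost"]
INVASION_ORDER = ["Nova Scotia", "Oregon", "British Columbia", "Northeastern United States"]


def tournament_scenarios(native_sources, invasion_order):
    """Return {focal population: [candidate sources]} for each tournament.

    Assumption: each tournament tests the focal population against every
    native source plus every invasive population placed earlier.
    """
    tournaments = {}
    placed = []  # invasive populations whose origin is already fixed
    for focal in invasion_order:
        tournaments[focal] = list(native_sources) + list(placed)
        placed.append(focal)
    return tournaments


tournaments = tournament_scenarios(NATIVE_SOURCES, INVASION_ORDER)
for number, (focal, sources) in enumerate(tournaments.items(), start=1):
    print(f"Tournament {number}: {focal} vs {len(sources)} candidate sources")

total = sum(len(sources) for sources in tournaments.values())
print("Scenarios compared in total:", total)
```

Under this reading, the four tournaments compare 4 + 5 + 6 + 7 = 22 scenarios in total.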