Genetic Structure
The DAPC analysis was completed retaining the first 19 principal
components as determined by the optimum a-score, and 19 discriminant
functions which explained 30.5% of the variance (Figure 2). The
scatterplot shows a main cluster that consists of 23 of the populations
including all from the Northeastern quadrant, most of the Upper Midwest
quadrant, Stokes and Watauga Counties in North Carolina, and overlaps
with Claiborne County, Tennessee. The Southern Gulf quadrant is the only
quadrant that does not have populations in the main cluster. The
populations in the Southeastern Atlantic quadrant that are closest to
the coastline are the furthest from the main cluster with the
southernmost population from Osceola County, Florida being the most
distant.
These trends were further examined using the R package LEA to assess
population structure across all individuals, independent of their
pre-assigned populations. The cross-entropy criterion identified K=5 as optimal
(Supplemental Figure 2). The five clusters can be described as assorting
into Northeast, Upper Midwest, Central, Southern, and Central Florida
(Figures 3, 4A). Moderate levels of admixture were apparent for most
populations. However, Southern populations had lower levels of admixture,
with individuals mostly assigned to cluster 2, 3, or 4 and showing low
assignment to all other clusters (Figure 3). Membership probabilities for
each of the five clusters were calculated for every tick, and each
individual was assigned to the cluster with its highest membership
probability; the DAPC analysis was then repeated using these assignments. This
iteration showed a similar pattern to the original population-based
DAPC, with one group being noticeably larger and consisting mainly of
individuals from the North (clusters 1 and 5). A small, distant
group is composed of the ticks from Central Florida (cluster 3),
and a further group, primarily composed of ticks from the Southern Gulf,
shares no overlap with the Northern group (Figure 4A).
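The assignment step described above (taking, for each tick, the cluster with the highest membership probability) can be illustrated with a minimal sketch. This is not the authors' code: the function name, the tick labels, and the probability values are hypothetical, and clusters are numbered from 1 only to match the numbering used in the text.

```python
def assign_main_cluster(membership):
    """Return the 1-based index of the cluster with the highest membership probability."""
    best_index = max(range(len(membership)), key=lambda k: membership[k])
    return best_index + 1

# Hypothetical membership probabilities for K=5 (each row sums to 1).
q_matrix = {
    "tick_A": [0.62, 0.05, 0.03, 0.10, 0.20],
    "tick_B": [0.02, 0.08, 0.85, 0.03, 0.02],
    "tick_C": [0.10, 0.15, 0.05, 0.60, 0.10],
}

# One main cluster per individual; these labels would replace the
# pre-assigned populations as the grouping factor in a repeated DAPC.
main_cluster = {tick: assign_main_cluster(q) for tick, q in q_matrix.items()}
print(main_cluster)
```

Ties are resolved here in favor of the lowest-numbered cluster, which is an arbitrary choice of this sketch.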
In this DAPC analysis we identified the top six alleles across four
scaffolds that contributed most to differentiating the clusters
(NW_024609857.1_scaffold2 position 192156357;
NW_024609868.1_scaffold3 position 60000181; NW_024609879.1_scaffold4
positions 31884195, 31884182, 66253319, 66253456; and NW_024610117.1
position 26212) (Supplemental Figure 4). Some of these positions fall
within annotated genes in the recent I. scapularis genome
(GenBank: GCA_016920785.2 ASM1692078v2), specifically gene loci
LOC8029585, LOC120844762, and LOC121834473. Across all clusters at these
sites, cluster 3 (Central Florida) consistently shows the opposite pattern
of allele frequency from all other clusters
(Supplemental Figure 4).
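The per-cluster allele comparison at a single site can be sketched as follows. This is an illustrative assumption about how such frequencies might be tabulated, not the authors' pipeline: genotypes are assumed to be diploid and coded as the count of the alternate allele (0, 1, or 2), missing calls are coded as None, and all names and values are hypothetical.

```python
from collections import defaultdict

def alt_allele_frequency_by_cluster(genotypes, clusters):
    """Alternate-allele frequency per cluster at one SNP (diploid, 0/1/2 coding)."""
    alt_counts = defaultdict(int)     # alternate alleles observed per cluster
    allele_totals = defaultdict(int)  # total alleles genotyped per cluster
    for genotype, cluster in zip(genotypes, clusters):
        if genotype is None:          # skip missing calls
            continue
        alt_counts[cluster] += genotype
        allele_totals[cluster] += 2
    return {c: alt_counts[c] / allele_totals[c] for c in allele_totals}

# Hypothetical genotypes at one site and each tick's main cluster.
genotypes = [0, 0, 1, 2, 2, 0, 1, None]
clusters = [1, 1, 2, 3, 3, 4, 5, 5]

freqs = alt_allele_frequency_by_cluster(genotypes, clusters)
print(freqs)
```

In this made-up example cluster 3 is fixed for the alternate allele while clusters 1 and 4 lack it, which is the kind of contrast the text describes for Central Florida.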
While morphological identifications confirmed the Central Florida ticks
as I. scapularis, Florida has populations of I. affinis ticks that can be
morphologically similar to I. scapularis. Thus,
we confirmed the species identity of the ticks collected from Central
Florida using 16S rRNA, finding that they are most similar to other
I. scapularis individuals from the Southern clade (Figure 4C).
To determine the alleles contributing to the differentiation of
the other four clusters, ticks within the Central Florida cluster were
removed from the dataset and the analysis was rerun. In doing so, the DAPC
showed more differentiation among the remaining four clusters; notably,
ticks from the Upper Midwest and Northeast no longer
completely overlapped (Figure 5B). We examined the ten SNPs that
contributed the most to the differentiation of these four groups and found
that they were all on scaffold 3 (NW_024609868.1 positions 42246251, 42246253,
50953023, 72034653, 72034688, 72034691, 72034789, 72034790, 72034791,
and 72034815; more information can be found in Supplemental Table 6).
Two other groups of SNPs that contribute noticeably to the
differentiation of the four geographic clusters are located on
scaffolds 4 (NW_024609879.1) and 8 (NW_024609883.1)
(Supplemental Figures 5-6).
The isolation by distance analysis was completed using the original 33
populations designated for the ticks. We do not see any strong
correlation between genetic and geographic distance of our populations
when all comparisons are included. However, Northern-to-Northern
comparisons show low genetic distance between populations even as
geographic distance increases, whereas the Southern-to-Southern
comparisons show a variable relationship between genetic and geographic
distance. The Northern-to-Southern comparisons are distinctly higher in
genetic distance across all geographical distances (Figure 5).
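The comparison by region pair can be illustrated with a minimal sketch that groups pairwise population comparisons by type and computes a Pearson correlation between geographic and genetic distance within each group. This only illustrates the idea; it is not the authors' method, it does not test significance (a permutation-based test such as a Mantel test would be needed for that), and the region labels and distance values are hypothetical.

```python
import math
from collections import defaultdict

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_by_comparison(pairs):
    """Group (region_a, region_b, geographic_km, genetic_distance) tuples
    by comparison type and correlate the two distances within each group."""
    grouped = defaultdict(lambda: ([], []))
    for region_a, region_b, geo_km, gen_dist in pairs:
        key = "-".join(sorted((region_a, region_b)))  # order-independent label
        grouped[key][0].append(geo_km)
        grouped[key][1].append(gen_dist)
    return {key: pearson(geo, gen) for key, (geo, gen) in grouped.items()}

# Hypothetical pairwise comparisons between populations.
pairs = [
    ("North", "North", 100, 0.01), ("North", "North", 500, 0.02),
    ("North", "North", 900, 0.03),
    ("South", "South", 100, 0.05), ("South", "South", 500, 0.02),
    ("South", "South", 900, 0.08),
    ("North", "South", 300, 0.20), ("North", "South", 1200, 0.22),
    ("South", "North", 1500, 0.24),
]

r_by_type = correlation_by_comparison(pairs)
print(r_by_type)
```

Splitting the comparisons this way makes it possible to see whether a weak overall correlation hides different within-region and between-region patterns.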