Genetic diversity and population structure
We next asked whether the population genetic data supported the three
independent origins identified by genomic-based epidemiology. We
developed 11 SSR markers (Table 2) and genotyped kochia populations
collected from the three geographic regions to measure population-level
genetic diversity and genetic similarity of populations between
localities. Across all populations, Fisher’s combined probability
test indicated that all SSR loci were in linkage equilibrium (P > 0.05)
but not in Hardy-Weinberg equilibrium (HWE; P < 0.05).
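As an illustration of how per-population test results can be pooled into a single across-population P value, the following is a minimal sketch of Fisher's combined probability method. The function name and the input P values are hypothetical; the software actually used for the tests reported above is not specified in this section, so this should be read as a sketch of the general technique rather than the analysis pipeline itself.

```python
import math

def fisher_combined(p_values):
    """Combine independent P values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis (k = number of tests).
    Returns (X, combined P value).
    """
    if not p_values or any(not 0 < p <= 1 for p in p_values):
        raise ValueError("p_values must be non-empty and lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # For even degrees of freedom (2k) the chi-square survival function
    # has a closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return 2 * half_stat, math.exp(-half_stat) * total

# Hypothetical per-population P values for one locus (illustration only).
stat, p_combined = fisher_combined([0.04, 0.20, 0.01, 0.35])
```

A locus would be flagged as deviating from equilibrium across populations when the combined P value falls below the chosen threshold (0.05 in the text).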
Across all loci and populations, 3.98% of the data were missing (Table
S2). Among loci, marker “SSR162” had the most missing data across
populations (13.8%); among populations, KS2S and KS8S had the most
missing data across loci (26.3% and 17.2%, respectively; Table S3).
For descriptive summaries and the neighbor-joining tree, marker “SSR162”
was removed because it had more than 10% missing data, and five
individuals (KS2S_4, KS2S_5, KS2S_7, KS2S_8, and KS8S_2) were removed
because each had more than 20% missing data after “SSR162” was removed.
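The two-step filter described above (drop loci above 10% missing data, then drop individuals above 20% missing data at the remaining loci) can be sketched as follows. The data layout, function name, and toy genotypes are assumptions made for illustration; only the two thresholds come from the text.

```python
def filter_missing(genotypes, locus_max=0.10, indiv_max=0.20):
    """Filter loci, then individuals, by their fraction of missing calls.

    genotypes: dict mapping individual ID -> dict of locus -> genotype,
               with None marking a missing call (hypothetical layout).
    Returns (kept_loci, kept_individuals).
    """
    loci = sorted({locus for calls in genotypes.values() for locus in calls})
    n = len(genotypes)
    # Step 1: fraction of individuals with a missing call at each locus.
    locus_missing = {
        locus: sum(calls.get(locus) is None for calls in genotypes.values()) / n
        for locus in loci
    }
    kept_loci = [locus for locus in loci if locus_missing[locus] <= locus_max]
    if not kept_loci:
        raise ValueError("no loci pass the missing-data threshold")
    # Step 2: per-individual missing fraction, computed on retained loci only.
    kept = {}
    for ind, calls in genotypes.items():
        frac = sum(calls.get(locus) is None for locus in kept_loci) / len(kept_loci)
        if frac <= indiv_max:
            kept[ind] = {locus: calls.get(locus) for locus in kept_loci}
    return kept_loci, kept

# Toy data set: 10 individuals, 4 hypothetical loci named A-D.
toy = {f"ind{i}": {"A": (1, 2), "B": (3, 3), "C": (5, 6), "D": (2, 2)}
       for i in range(1, 11)}
toy["ind1"]["C"] = None   # locus C is missing in 2 of 10 individuals
toy["ind2"]["C"] = None
toy["ind10"]["A"] = None  # ind10 is missing 1 of the 3 retained loci
kept_loci, kept_individuals = filter_missing(toy)
```

Computing the per-individual fraction only after the locus filter matters: an individual whose missing calls are concentrated at a discarded locus is not penalized for it.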
Allele counts, expected heterozygosity, and evenness for all loci across
all populations, both before and after the removal of individuals with
missing data, are reported in Table S2. Descriptive summaries of 44 populations
at ten SSR loci are presented in Table 3 (data for all 11 SSR loci are
presented in Table S3). Populations ranged in their percentage of total
alleles observed and allelic richness from 57.7% and 1.42 (KS13R) to
24.7% and 2.61 (CO7R). FIS ranged from -0.04
(95% CI = -0.28 to 0.13; KS13R) to 0.58 (95% CI = 0.33 to 0.79; MT3R),
with most values between 0.2 and 0.4. A positive FIS indicates a
deficiency of heterozygotes in the population relative to the proportion
expected under HWE, and a negative FIS indicates an excess of
heterozygotes. The FIS results should be interpreted with caution
because the loci did not meet the assumptions of Hardy-Weinberg
equilibrium (Waples, 2015) and many confidence intervals were wide,
spanning from negative to positive values.
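To make the sign convention concrete, the sketch below computes a simple single-locus inbreeding coefficient as FIS = 1 - Ho/He, where Ho is the observed and He the expected heterozygosity. This is a simplified estimator assuming diploid genotypes and no small-sample correction; it is not necessarily the estimator used to produce the values in Table 3, and the function name and toy genotypes are hypothetical.

```python
from collections import Counter

def fis_single_locus(genotypes):
    """Return 1 - Ho/He for one locus in one population.

    genotypes: list of (allele, allele) tuples for diploid individuals.
    He is computed from allele frequencies as 1 - sum(p_i^2),
    without a small-sample correction (a simplifying assumption).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Observed heterozygosity: fraction of individuals with two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity under HWE from the pooled allele frequencies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    h_exp = 1 - sum((c / (2 * n)) ** 2 for c in counts.values())
    if h_exp == 0:
        raise ValueError("locus is monomorphic; the ratio is undefined")
    return 1 - h_obs / h_exp

# Toy populations with allele frequencies of 0.5 each (He = 0.5).
only_homozygotes = [(1, 1), (1, 1), (2, 2), (2, 2)]    # heterozygote deficiency
only_heterozygotes = [(1, 2), (1, 2), (1, 2), (1, 2)]  # heterozygote excess
hwe_proportions = [(1, 1), (1, 2), (1, 2), (2, 2)]     # matches HWE expectation
```

With these toy inputs the function returns a positive value for the all-homozygote sample, a negative value for the all-heterozygote sample, and zero when genotype proportions match the HWE expectation.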
As loci and populations did not meet the assumptions of HWE, a
neighbor-joining tree was used to assess genetic similarity between
populations. This tree showed some expected groupings by region, with 12
Central Great Plains populations grouped in a large clade with 100%
bootstrap support (Figure 2). This clade also contained OR4R
(Pacific Northwest) and MT2R (Northern Plains). The STRUCTURE analysis
identified K = 3 as the best-supported number of clusters, or gene pools
(Figure S1), and also supported the grouping of OR4R and MT2R with the
Central Great Plains populations, including CO1R, KS10R, and KS11R
(Figure 3). The populations from the Pacific Northwest largely clustered
together (OR2R, OR3R, OR6R, OR7R, OR9S, ID1R, and ID2R), with the clade
containing OR9S, ID1R, OR7R, and ID2R supported at 61.5%
(Figure 2). Populations KS13R, MT3R, and CO6R clustered with this
Pacific Northwest group (Figure 2) and showed similar cluster membership
in the STRUCTURE analysis (Figure 3). Some groupings were unexpected,
such as the grouping of TX2R, TX3R, TX4R, and TX5R (Central Great
Plains) with populations from Alberta, Canada (Northern Plains), and
with OR1R (Figure 2).