3.1 | Bioinformatics and population grouping
The total number of raw Illumina sequencing reads for the six plates was
2,983 million, or on average 497 million per plate. The proportion of
reads with a correct barcode and restriction enzyme cut site varied from
69% to 83% per plate with an average of 76%. Alignment to the S. nigrocinctus reference genome resulted in a 79% overall
alignment rate, with the percentage of aligned reads per sample ranging
from 56 to 77% (mean = 71%). Removing individuals with high
percentages of missing genotypes (> 15%) and SNPs
with low genotyping rates (< 20%) left a final dataset of
398 individual fish (321 in 2014 and 77 in 2015)
and 11,146 SNPs.
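The two missing-data filters described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `filter_genotypes`, the genotype coding (0/1/2 alternate-allele counts with `None` for a missing call), and the order of the two steps (individuals first, then SNPs) are all assumptions made for the example; only the 15% and 20% thresholds come from the text.

```python
def filter_genotypes(genotypes, max_ind_missing=0.15, min_snp_call_rate=0.20):
    """Apply the two missing-data filters to a toy genotype table.

    genotypes: dict mapping sample ID -> list of calls (0/1/2, or None if missing).
    Returns (filtered_genotypes, indices_of_retained_snps).
    """
    # Step 1 (assumed order): drop individuals with > 15% missing genotypes.
    kept = {
        sample: calls
        for sample, calls in genotypes.items()
        if sum(c is None for c in calls) / len(calls) <= max_ind_missing
    }
    if not kept:
        return {}, []

    # Step 2: drop SNPs genotyped in < 20% of the retained individuals.
    n_snps = len(next(iter(kept.values())))
    kept_snps = [
        j for j in range(n_snps)
        if sum(calls[j] is not None for calls in kept.values()) / len(kept)
        >= min_snp_call_rate
    ]
    filtered = {s: [calls[j] for j in kept_snps] for s, calls in kept.items()}
    return filtered, kept_snps


# Hypothetical toy data: 4 fish x 10 SNPs.
toy = {
    "fish_a": [0, 1, 2, None, 0, 1, 2, 0, 1, 2],
    "fish_b": [1, 1, 0, None, 2, 1, 0, 0, 1, 1],
    "fish_c": [None, None, 0, None, 2, 1, 0, 0, 1, 1],  # 30% missing
    "fish_d": [2, 0, 0, None, 1, 1, 0, 2, 1, 0],
}
filtered, kept_snps = filter_genotypes(toy)
```

In this toy table, fish_c exceeds the individual-level threshold and the fourth SNP is never genotyped, so both are removed.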
Ancestry analysis revealed four discrete spawning populations. The sNMF
analysis in LEA identified four populations based on the cross-entropy
criterion (Figure 2a-2b), and PCA supported the K=4 putative population
clusters derived from sNMF (Figure 2c). STRUCTURE
(Pritchard et al. 2000) analysis also supported K=4 populations, but
with greater admixture between populations 2 and 3 than was estimated by
the sNMF algorithm. All four of these populations were represented in both the
2014 and 2015 collections (Figure 3). Pairwise FST values among
putative population-year combinations revealed consistent genetic
differentiation between populations in each year (Table 1).
Additionally, this differentiation was conserved across years: little
differentiation, as measured by FST, was observed
within a population between years. These findings support the results
of the ancestry analysis and provide evidence that the 2014 and 2015
collections are composed of similar mixtures of discrete spawning
populations.
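To make the pairwise comparison concrete, the sketch below computes FST between two groups from diploid genotype calls. The text does not state which FST estimator was used, so Hudson's estimator (ratio of sums across SNPs) is used here purely as an assumption for illustration; the function names and the 0/1/2 genotype coding are likewise hypothetical.

```python
def allele_freq(group, snp):
    """Alternate-allele frequency and number of sampled chromosomes at one SNP."""
    calls = [ind[snp] for ind in group if ind[snp] is not None]
    n_chrom = 2 * len(calls)  # diploid: two chromosomes per genotyped fish
    if n_chrom == 0:
        return None, 0
    return sum(calls) / n_chrom, n_chrom


def hudson_fst(pop1, pop2):
    """Hudson's FST, combined across SNPs as a ratio of sums (assumed estimator)."""
    num = den = 0.0
    for snp in range(len(pop1[0])):
        p1, n1 = allele_freq(pop1, snp)
        p2, n2 = allele_freq(pop2, snp)
        if n1 < 2 or n2 < 2:
            continue  # skip SNPs not genotyped in one of the groups
        # Squared frequency difference, corrected for finite sample size.
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else float("nan")


# Hypothetical toy groups (rows = fish, columns = SNPs, values = alt-allele counts).
fixed_a = [[0, 0], [0, 0]]
fixed_b = [[2, 2], [2, 2]]
same_a = [[1, 1], [1, 1]]
same_b = [[1, 1], [1, 1]]

fst_fixed = hudson_fst(fixed_a, fixed_b)  # groups fixed for different alleles
fst_same = hudson_fst(same_a, same_b)     # identical allele frequencies
```

Under this estimator, groups fixed for different alleles give a value of 1, while groups with identical frequencies give a value at or slightly below 0 because of the sample-size correction. Applied to each population-year pair, this would yield a matrix of the kind summarized in Table 1.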
Relatedness analysis detected no related individuals (up to half siblings)
in the collections. This indicates that the discrete sNMF-derived
populations are not simply groups of closely related individuals.
Furthermore, this result ensured that no related
individuals were included in the subsequent genotype-environment
association models; including relatives is thought to inflate false
positive rates because the samples are not independent (Newman et al. 2001;
Voight and Pritchard 2005).
Fewer private alleles were detected in 2015 than in 2014, and this
pattern remained significant after adjusting for the smaller sample size
in 2015 (Table 2). This analysis was done separately for each sNMF-derived
population, and we detected private alleles in common among all four
populations (Table 3), indicating that the same suite of alleles was not
detected in 2015 across populations. This suggests that selection was
stronger in 2015 than in 2014, leading to the loss of deleterious alleles
in the 2015 cohort, which is consistent with the more anomalous oceanic
conditions observed in 2015 than in 2014 (Cavole et al. 2016; Gentemann et al.
2017; Jones et al. 2018).
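One way to compare private-allele counts between collections of unequal size is to repeatedly subsample the larger collection down to the size of the smaller one. The sketch below illustrates that idea only; the text does not describe how the sample-size adjustment was actually made, so the resampling scheme, the function names, and the 0/1/2 genotype coding are assumptions.

```python
import random


def alleles_present(group):
    """Set of (snp_index, allele) pairs observed in a group of genotypes."""
    present = set()
    for calls in group:
        for snp, g in enumerate(calls):
            if g is None:
                continue  # missing call contributes no alleles
            if g < 2:
                present.add((snp, "ref"))
            if g > 0:
                present.add((snp, "alt"))
    return present


def private_counts(group_a, group_b):
    """Number of alleles private to group_a and private to group_b."""
    a, b = alleles_present(group_a), alleles_present(group_b)
    return len(a - b), len(b - a)


def subsampled_private_counts(larger, smaller, n_reps=100, seed=1):
    """Subsample `larger` to len(smaller) without replacement, n_reps times."""
    rng = random.Random(seed)
    k = len(smaller)
    return [private_counts(rng.sample(larger, k), smaller) for _ in range(n_reps)]


# Hypothetical toy data: every 2014 fish carries an allele absent in 2015.
toy_2014 = [[1, 0], [1, 0], [1, 0], [1, 0]]
toy_2015 = [[0, 0], [0, 0]]
reps = subsampled_private_counts(toy_2014, toy_2015, n_reps=10)
```

Each replicate returns a pair (private to the subsampled larger collection, private to the smaller collection); the distribution of these pairs across replicates indicates whether a difference persists at equal sample size.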