Population genomics:

Filtered 75bp reads for each individual, generated via ddRADseq from Jacobs et al . (2020) and accessed from (ddRADseq NCBI short read bioproject: PRJNA607173), were mapped using bwa mem andSAMtools using settings described in that paper to theSalvelinus sp. genome from NCBI (ASM291031v2). The number of reads per individual ranged from 1 to 3.5 million. RAD loci were built in the gstacks module of Stacks v2.53 for 200 individuals (Awe_Bn= 26, Awe_Pl=29, Dughaill_Bn=28, Dughaill_Pl=27, naSealga_Bn=18, naSealga_Pl=20, Tay_Bn=21, Tay_Pl=31). SNPs were retained in the populations module of Stacks if they met the following criteria: present in 66% of all individuals in each population and across all populations, a minimum minor allele frequency of 0.05, maximum observed heterozygosity of 0.5. Each ecomorph within a lake was considered to be a discrete population. The script filter_hwe_by_pop.pl to filter out sites outside Hardy-Weinberg equilibrium within populations (available at https://github.com/jpuritz/dDocent). vcftools v0.1.13 was used to filter to a minimum coverage of 5x and a maximum of 50x. A principal component analysis was performed to identify the major axes of genetic variation using SNPRelate v1.22.0 (R package) .

Genotype-phenotype association analyses:

To determine the association between genotypes and phenotypic variation in head or body shape, we ran Linear Mixed Models (LMM) in Gemma v0.98.1 . Univariate and Multivariate LMMs with Wald’s test were run using PCs 1-5 for head shape and body shape, the SNP dataset generated for the population genomics analyses, and lake of origin as a co-variate . Missing genotypes were imputed using LinkImpute v1.1.4 and a relatedness matrix was generated using Gemma before running the models. We determined significant associations using bonferroni-corrected P-values (0.05/7329 unlinked SNPs) from the Wald’s tests. The number of unlinked SNPs was determined by LD-pruning the full SNP dataset using the snpgdsLDpruning function in SNPRelate . Bayesian Sparse Linear Mixed Models (BSLMM) were run using PC1-5 variables to determine how much of the phenotypic variation is explained by the SNPs in our dataset (PVE) and secondly how much of that variation is explained by large-effect loci (PGE), and finally how polygenic each phenotype is (⍴). Manhattan plots were made using CMplot v4.0 (R package) (https://github.com/YinLiLin/CMplot).
We subsequently determined if SNPs showing significant associations with head shape or body shape morphology were found within annotated genes in the Salvelinus sp. reference genome using BEDtools v2.27.1 . The functions of genes containing, or ± 1kbp of, associated SNPs were investigated using GO term overrepresentation analysis (ORA) and gene set enrichment analysis (GSEA). These analyses were run usingtopGO v2.40.0 (R package) with all genes containing any RAD loci as the full comparison dataset. Results were summarised usingREVIGO before visualisation in Cytoscape v3.91 .

Comparisons to known QTLs:

Using existing information on the genetics of important phenotypes from other salmonid species, we mapped a database of 1,338 QTL markers to theSalvelinus sp. genome. This was based on a previously published database of QTLs involved in traits related to morphology and life history, derived from a range of salmonid species and previously mapped to the Salmo salar genome . Additionally, a literature search was conducted up to April 2021 to augment the existing database with more recently published QTLs. This literature search was conducted in Web of Science and Google Scholar using the search terms “QTL”, “quantitative trait loci”, “salmonid”, and the common and scientific names for rainbow trout, Atlantic salmon, Arctic charr, lake whitefish, Chinook salmon, coho salmon, brook trout, and lake trout. QTL marker sequences were gathered for 17 different phenotypes: body length, body shape, body weight, Fulton’s condition factor, directional change, disease resistance, embryonic development, gill rakers, growth rate, hatching time, head shape, parasite resistance, salinity tolerance, sexual maturation, smolting, spawning time, upper temperature tolerance (Table S1).
Following Jacobs et al. (2017), the strategy of mapping the QTL-linked markers to the Salvelinus sp. genome depended on the QTL marker type: RAD loci were mapped using Bowtie2 v2.4.4 and the very sensitive pre-set; microsatellite primer sequences, which are shorter, were mapped using Bowtie v1.3.1 allowing for 3 mismatches. QTLs for which the flanking markers mapped to different chromosomes were removed. Redundant QTLs, i.e., where two QTLs for the same trait from the same species mapped to the same location, were removed only keeping the QTL with the higher PVE or LOD score (following . For QTLs where more than one marker was reported, we attempted to map all markers. Position values for the QTLs markers were then compared to positions of the phenotype associated SNPs using BEDtools , with a cut-off of ±100kbp. This value was used so that we could consider SNPs within the range to be proximal to a QTL peak while also accounting for the large size of many of the QTLs in the database. In total, we successfully mapped 669 QTL-linked sets of markers to theSalvelinus sp. genome after removing redundant QTLs (Table S2).

Genomic response to selection:

We investigated if the phenotype-associated SNPs identified in our analyses showed signals of a response to selection and if those signals were replicated across ecomorph pairs. To test this, for each ecomorph pair we compared FST and DXY values for phenotype-associated SNPs to a random background subset of SNPs. This random subset was 100 SNPs randomly selected from the whole dataset and the mean FST and DXY values for those SNPs were calculated. This was repeated 10,000 times and the means for FST and DXY were taken across all permutations. These permuted values were then compared to the empirical mean FST and DXY values for the phenotype-associated SNPs using the t.test function in R.

Analyses of recombination rate variation:

To test the effect of the recombination landscape on phenotype-genotype association, we first estimated recombination rates using the published Arctic charr linkage map (N=3,636) using MareyMap v1.3.6 . RAD loci from the linkage map were aligned to the Salvelinus sp. reference genome with Bowtie2 using the -very-sensitive setting. Loci were kept if they uniquely mapped to one location, mapped to the same chromosome as all other loci on their linkage group, and followed the orientation of the linkage map (i.e., not reversed). The filtered dataset was used to estimate the recombination rate across each chromosome using a spline algorithm. Spar values were varied for each chromosome from 0.5 to 0.9, depending on chromosome size, to best fit the data . Subsequently, WindowScanR v0.1 (available at: https://github.com/tavareshugo/WindowScanR) was used to summarize recombination rate values in 1 MB windows along the genome. All SNPs were assigned to these windows using BEDtools . A random subset of 100 SNPs was then selected and the mean recombination rate for those SNPs was calculated based on their windows. This was repeated 10,000 times to generate a background mean recombination rate, which was then compared to the mean recombination rate of the phenotype-associated SNPs (based on their windows) using a t-test.