DISCUSSION
The primary goal of this study was to provide an experimental test of the extent to which genetic offsets, a correlative space-for-time approach, can predict performance of populations exposed to new environments. By transplanting individuals from their home environment to the novel climate of the common gardens, we substituted space for time as a proxy for rapid climate change. We found that genetic offsets based on existing gene-environment relationships work well to predict performance of populations experiencing new environments, and much better than climate differences alone (Table 1). We view this finding as encouraging preliminary evidence that genetic offsets may represent a first-order estimate of the degree of expected maladaptation of populations exposed to novel environments. While our study considered climate differences across geographic space, in principle our findings should be relevant to temporal changes in climate as well. As such, genetic offsets could provide a means to estimate aspects of population-level vulnerability to climate change. Additional research is warranted to determine the extent to which our findings are generalizable to other systems and populations growing in natural environments.
That genetic offsets outperformed naive climate distances is not surprising and can be best understood by considering the similarities and distinctions between these two methods. In many ways, genetic offsets share a conceptual foundation with climate transfer distances long used in forestry (Mátyás, 1996). The establishment of provenance trials, in which tree seeds from multiple locations are collected and grown in multiple sites, has allowed for evaluation of tree performance as a function of differences in climate between sources and planting sites (i.e., response functions derived from climate transfer distances; Wang, Hamann, Yanchuk, O’Neill, & Aitken, 2006; Wang, O’Neill, & Aitken, 2010). These experiments provide excellent insight into the climate variables that best predict phenotypic performance upon transfer to a new site, but are time- and labor-intensive, and not practical for most study systems. A simpler approach is to delineate climate-based seed zones from which seeds should be selected for restoration under the hypothesis that maladaptation of seedlings is minimized (and production is maximized) when movement of seeds is restricted to other sites with similar climate (e.g., Bower, St Clair, & Erickson, 2014; Pike et al., 2020). The distinction between the “traditional” climate transfer distances used for seed zone delineation and genetic offsets is simply that genetic offsets use re-scaled climate distances based on the modeled associations with (adaptive) genomic variation, whereas climate distances typically weigh the included variables equally despite potential variation in their adaptive importance. Existing gene-environment relationships described by the fitted turnover functions from GF provide the mechanism that allows proper weighting of different climate variables, based on how allele frequencies are aligned with climate gradients.
Gradients strongly associated with genomic variation (and portions of these gradients where genetic patterns change most rapidly) will have greater contribution to genetic offsets than will unimportant variables (or portions of gradients where allele frequencies generally are constant; Capblancq, Fitzpatrick, Bay, Exposito-Alonso, & Keller, 2020; Fitzpatrick & Keller, 2015). This also fits well with a recent study in lodgepole pine (Pinus contorta), in which the climate variables identified as important in GEA models were strongly correlated (r = 0.9) with the climate variables associated with phenotypic performance in a 20-year provenance trial (Mahony et al., 2020). This suggests that one of the realized benefits of GEA may be in identifying which among a set of climate variables are most predictive of local adaptation, which is the same principle being employed by GF to weight different climate variables based on the strength of the genomic association when calculating genetic offsets. The use of GEA plus genetic offsets may prove useful for conservation planning in long-lived species or those for which phenotypic information from experimental assessment of field performance is lacking.
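The contrast between an equally weighted climate distance and a genetic offset can be illustrated with a minimal sketch. It assumes that a genetic offset is computed as the Euclidean distance between two environments after each climate variable has been mapped through its fitted turnover (cumulative importance) function. The turnover curves, climate values, and function names below are hypothetical and are not taken from our fitted GF models.

```python
import math
from bisect import bisect_right

def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the ends of the grid."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

def climate_distance(home, garden):
    """Euclidean distance in standardized climate space (all variables weighted equally)."""
    return math.sqrt(sum((g - h) ** 2 for h, g in zip(home, garden)))

def genetic_offset(home, garden, turnover):
    """Euclidean distance after rescaling each variable by its turnover function.

    turnover holds one (grid, cumulative_importance) pair per climate variable.
    """
    total = 0.0
    for h, g, (xs, ys) in zip(home, garden, turnover):
        total += (interp(g, xs, ys) - interp(h, xs, ys)) ** 2
    return math.sqrt(total)

# Hypothetical turnover functions on a standardized climate grid.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
turnover = [
    (grid, [0.0, 0.1, 0.2, 0.3, 0.4]),        # variable 1: strong genomic association
    (grid, [0.0, 0.005, 0.01, 0.015, 0.02]),  # variable 2: weak genomic association
]

home = [0.0, 0.0]
garden_a = [1.0, 0.0]  # differs from home only in variable 1
garden_b = [0.0, 1.0]  # differs from home only in variable 2

dist_a = climate_distance(home, garden_a)
dist_b = climate_distance(home, garden_b)
offset_a = genetic_offset(home, garden_a, turnover)
offset_b = genetic_offset(home, garden_b, turnover)
```

In this toy case the two gardens are equally distant from the home site in climate space, yet the transfer along the strongly associated variable yields a much larger genetic offset than the transfer along the weakly associated one.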
The finding that genetic offsets had good predictive power regardless of whether they were based on sets of outlier SNPs or simply SNPs selected at random from the genome (which, surprisingly, slightly outperformed genetic offsets based on outlier SNPs) is harder to explain. One explanation is that if allele frequencies of the genome as a whole tend to be aligned with the same environmental gradients that are important to local adaptation (i.e., the gradients of adaptive and neutral genomic background are parallel or proportional), then one could serve as an adequate proxy for the other. If this is the case, then SNPs selected at random should provide the same rank weighting of the climate gradients as would outlier SNPs, which was generally the case in our study (Fig. 2, Supplementary Fig. S3). However, as mentioned above, the shapes of the turnover functions also will influence genetic offsets. All else being equal, larger genetic offsets will occur for populations transferred between environments on either side of a threshold as compared to populations transferred along flat portions of allele turnover gradients. Assuming these nonlinearities reflect true signals of local adaptation, we would then expect genetic offsets that incorporate these patterns to outperform linear methods that do not. Our findings do not support this expectation. In this study, the turnover functions based on outlier SNPs often showed pronounced nonlinearities, whereas those based on randomly sampling SNPs from the genomic background tended to be more linear (Figs. 2, 3), yet genetic offsets based on outliers tended to be strongly correlated with those from random SNPs (Supplementary Fig. S6). Further, random SNPs slightly outperformed outlier SNPs in explaining height growth in the common gardens. Additional research is required to determine whether this result is an artefact of our study or a more general pattern.
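The effect of turnover-function shape on genetic offsets can be sketched for a single climate variable. The example below uses a logistic curve purely as a stand-in for a threshold-like turnover function (GF turnover functions are non-parametric, so the logistic form, its parameters, and the function names are illustrative assumptions) and compares it with a linear function of the same total importance.

```python
import math

def threshold_turnover(x, total=0.4, midpoint=0.5, steepness=6.0):
    """Monotonic, threshold-like cumulative importance (illustrative logistic form)."""
    return total / (1.0 + math.exp(-steepness * (x - midpoint)))

def linear_turnover(x, total=0.4, lo=-2.0, hi=2.0):
    """Cumulative importance that rises at a constant rate along the gradient."""
    return total * (x - lo) / (hi - lo)

def offset_1d(home, garden, turnover):
    """Single-variable genetic offset: absolute change in cumulative importance."""
    return abs(turnover(garden) - turnover(home))

# Two transfers of identical climatic magnitude (one standardized unit).
flat_transfer = (-2.0, -1.0)     # both sites on the flat part of the threshold curve
crossing_transfer = (0.0, 1.0)   # sites on either side of the threshold

flat_nonlinear = offset_1d(*flat_transfer, threshold_turnover)
crossing_nonlinear = offset_1d(*crossing_transfer, threshold_turnover)
flat_linear = offset_1d(*flat_transfer, linear_turnover)
crossing_linear = offset_1d(*crossing_transfer, linear_turnover)
```

Under the linear function both transfers produce the same offset, whereas under the threshold-like function the transfer that crosses the threshold produces a far larger offset than the transfer along the flat portion, which is the contrast described above.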
Another primary goal of our study was to explore differences between GF models fit to different sets of outlier SNPs. There are numerous ways to detect statistical outlier SNPs, and, as was the case in this study, it is not uncommon for different methods to identify different SNPs as outliers, leaving some uncertainty regarding which SNPs are false vs. true positives, and therefore which SNPs truly are associated with climate adaptation and thus most informative from a predictive standpoint. By fitting GF models to different sets of outlier SNPs, we can ask: To what extent do different sets of outlier SNPs produce different inferences? We found that although the different outlier methods detected different sets of outlier SNPs (Fig. 1), GF models fit to different sets of outliers from bayenv2, lfmm, and GF-X were similar in terms of variable importance ranking (though R² values differed, Supplementary Figs. S3 and S5), the general shapes of the turnover functions (Figs. 2, 3), and therefore, the predicted spatial patterns of genetic variation (Fig. 4), and by extension, the predicted genetic offsets (Supplementary Fig. S6). GF models fit to SNPs selected at random or those selected using allele frequencies uncorrected for population structure (GF-Raw) also generally followed the same pattern of variable importance ranking as other outlier detection methods, but given that these methods selected a large proportion of unique SNPs, they produced turnover functions and predicted spatial patterns that differed from each other and from bayenv2, lfmm, and GF-X.
The similarity in variable importance ranking and predictions from different sets of outlier SNPs would arise if (1) the outlier SNPs they shared tended to have strong relationships with climate (and therefore would have greater contribution to the fitted turnover functions from GF; Ellis et al., 2012) and/or (2) the outlier SNPs unique to each method tended to have similar relationships (i.e., shapes of turnover functions) with climate. We have evidence for both possibilities. The shapes and cumulative importance of the turnover functions for the outlier SNPs unique to bayenv2, lfmm, and GF-X were similar (Fig. 3) and the total R² from GF models increased for SNPs as their outlier status was shared among an increasing number of detection methods (Supplementary Fig. S7). Outliers unique to a single method likely represent a mix of false positive SNPs along with some true positives that may be better detected by one method over another, although these are difficult to separate in real data. Our experimental design and sampling strategy were specifically chosen to minimize false positives arising from demographic history, and our simulations testing GF-Raw suggested a low type I error rate under a simple scenario of isolation by distance. However, under more complex demographic histories we would expect GF-Raw to be prone to false positives because it does not have an internal control for neutral population structure. Given this, and the observed reduction in unique outliers identified by GF before and after correcting for population relatedness (i.e., GF-Raw vs. GF-X), we advocate fitting GF only to allele frequencies that have been properly corrected for demographic history.
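The bookkeeping behind comparisons of shared and unique outliers can be sketched as follows. The SNP identifiers and the helper names are hypothetical, and the snippet only tallies how many detection methods flag each SNP and which SNPs are unique to one method; it does not reproduce the counts in Fig. 1.

```python
from collections import Counter

def detection_counts(outliers_by_method):
    """Count, for each SNP, how many detection methods flagged it as an outlier."""
    counts = Counter()
    for snps in outliers_by_method.values():
        counts.update(set(snps))  # set() guards against duplicate IDs within a method
    return counts

def unique_to(method, outliers_by_method):
    """Return the SNPs flagged by the given method and by no other method."""
    others = set()
    for name, snps in outliers_by_method.items():
        if name != method:
            others |= set(snps)
    return set(outliers_by_method[method]) - others

# Hypothetical outlier sets keyed by detection method.
outliers = {
    "bayenv2": {"snp1", "snp2", "snp3"},
    "lfmm": {"snp2", "snp3", "snp4"},
    "GF-X": {"snp3", "snp5"},
}

counts = detection_counts(outliers)
shared_by_all = {snp for snp, n in counts.items() if n == len(outliers)}
gfx_only = unique_to("GF-X", outliers)
```

Grouping SNPs by these counts is one way to set up a comparison of per-SNP R² against the number of methods supporting outlier status.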
In terms of outlier detection, it is notable that GF-X detected the fewest outliers overall and the fewest outliers unique to that method (Fig. 1). Unlike bayenv2 and lfmm, GF is multivariate, can accommodate interactions between variables, and assumes no parametric form of the allele frequency ~ environment relationship (although it does assume monotonicity). Therefore, GF-X may be less prone to the multiple testing problem inherent in univariate methods, or to outlier loci arising due to departures from the assumed linear model. As such, the combination of GF run on standardized allele frequencies produced by bayenv2 as done in this study (GF-X) could provide a more holistic approach to multivariate outlier detection that is robust to the shape of the allele frequency ~ environment relationship, while also correcting for finite sampling and population structure. Because GF reports an R² for each predictor variable in the model as well as for the model as a whole, it also provides a means to consider outlier status from the context of individual climate gradients as well as more comprehensively. Taken together, we feel GF warrants further study as a useful outlier detection method under simple demographic histories, or when provided with allele frequencies that have been corrected for population relatedness, especially for systems under strong, linear selection and intermediate migration (Supplementary Fig. S2).
While still a new and largely untested method, GF is increasingly being applied to genomic studies, including quantifying population-level climate change vulnerability. However, concerns have been raised about the application of genetic offsets in this capacity, especially for mobile organisms with short generation times (Fitzpatrick, Keller, et al., 2018). Common garden experiments are not perfect proxies for climate change or organisms in natural environments, but our results suggest that existing genetic patterns across space and associated genetic offsets may be informative for predictions across time as well, even if these predictions are based on neutral genetic patterns. Given the inherent complexities, for almost any organism it will be challenging to predict the exact nature of genomic change in response to environmental change. However, for some organisms, it may be possible to use existing gene-environment relationships to develop adequate assessments of the magnitude of expected genomic change based on genetic offsets, which can provide a proxy for population-level exposure to climate change.