RESULTS
GF outlier detection – Simulations: Testing GF without any correction for population structure (equivalent to GF-Raw) against simulated scenarios of 1D isolation by distance showed that under linear selection, GF had good power (>0.8 at 𝛼 = 0.05) to detect loci under moderate to strong selection for most migration scenarios, although power was somewhat reduced under moderate selection with high migration (s = 0.1, u = 8) (Fig. S2a). Under weak selection (s = 0.01), GF was underpowered to detect selection under all migration scenarios. Under non-linear selection (Fig. S2b), power was also generally low (<0.5) for all combinations except strong selection (s = 0.2) with low to moderate migration (u = 2 or 4). However, the false positive rate was well calibrated, falling between 0.04 and 0.06 for 𝛼 = 0.05 (Fig. S2c). Thus, under this specific scenario of 1D isolation by distance, GF had good power to detect moderate to strong linear selection or strong non-linear selection, with a low frequency of false positives.
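The power and false positive rates reported above are both rejection rates at a fixed 𝛼, evaluated on different sets of loci. The following is a minimal sketch of that bookkeeping, not the simulation pipeline itself: the function name `rejection_rate` and all p-values are hypothetical and chosen only for illustration.

```python
def rejection_rate(p_values, alpha=0.05):
    """Fraction of tests whose p-value falls below alpha."""
    if not p_values:
        raise ValueError("p_values must not be empty")
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical p-values, NOT taken from the study.
# Loci simulated under selection: the rejection rate estimates power.
selected_p = [0.001, 0.02, 0.20, 0.004, 0.03]
# Neutral loci: the rejection rate estimates the false positive rate.
neutral_p = [0.40, 0.03, 0.75, 0.51, 0.12, 0.88, 0.29, 0.64, 0.97, 0.33,
             0.45, 0.08, 0.59, 0.71, 0.16, 0.92, 0.26, 0.54, 0.83, 0.37]

power = rejection_rate(selected_p)
false_positive_rate = rejection_rate(neutral_p)
```

A well-calibrated test should give a false positive rate close to 𝛼 on the neutral loci, which is the sense in which values of 0.04 to 0.06 are described as well calibrated.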
GF outlier detection – Empirical: Out of 107,309 high-quality SNPs, 23 (0.02%) were identified as statistical outliers by all four outlier detection methods (Fig. 1). GF-X detected the fewest outliers (120) and had the fewest outliers unique to that method (22), and therefore shared the largest proportion (98/120 = 81.67%) of its statistical outliers with one or more of the other outlier detection methods. In contrast, Bayenv detected the largest number of statistical outliers (320), and GF-Raw had the largest proportion (234/291 = 80.41%) of detected outliers unique to that method.
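The overlap statistics above reduce to set operations on the outlier lists from each method. The sketch below illustrates this with a toy example (the method labels `A`, `B`, `C` and the SNP identifiers are hypothetical) and then recomputes the reported percentages from the counts given in the text.

```python
def unique_to(name, outliers):
    """SNPs flagged by method `name` and by no other method."""
    others = set().union(*(s for k, s in outliers.items() if k != name))
    return outliers[name] - others


# Toy outlier sets with hypothetical SNP identifiers.
outliers = {
    "A": {"snp1", "snp2", "snp3", "snp4"},
    "B": {"snp2", "snp5"},
    "C": {"snp2", "snp3"},
}

unique_a = unique_to("A", outliers)                    # flagged only by A
shared_fraction_a = 1 - len(unique_a) / len(outliers["A"])
common_to_all = set.intersection(*outliers.values())   # flagged by every method

# Percentages recomputed from the counts reported in the text.
gfx_shared_pct = round(100 * 98 / 120, 2)       # GF-X outliers shared with another method
gfraw_unique_pct = round(100 * 234 / 291, 2)    # GF-Raw outliers unique to GF-Raw
all_methods_pct = round(100 * 23 / 107309, 2)   # SNPs flagged by all four methods
```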
GF modeling of SNP outliers - Of the 320 outlier SNPs detected using Bayenv, 242 (75.62%) had an R² greater than zero in the GF model. This compares to 146 of 310 (47.1%) outlier SNPs for LFMM and 42 of 71 (59.2%) outlier SNPs for Bayenv-LFMM (note that, by definition, all (100%) GF-Raw and GF-X outliers had an R² greater than zero). On average, 49.64 of 500 (9.93%) SNPs had an R² greater than zero in the 999 GF models fitted to randomly selected SNPs.
Latitude was the most important predictor for all sets of SNPs (both outliers and random), followed by winter temperature (bio11), whereas elevation and diurnal range (bio2) were the least important variables (Supplementary Fig. S3). GF-Raw had the strongest associations (highest R² of all models) with all variables, and therefore the aggregate turnover functions for GF-Raw attained the greatest maximum height for all variables (Fig. 2). Random SNPs had the weakest associations (lowest R² of all models) for all variables except elevation and diurnal range (bio2), for which Bayenv-LFMM had a lower R². Although the aggregate turnover functions differed in their maximum height, reflecting differences in variable importance, most of the aggregate turnover functions based on outlier SNPs had a similar shape, with thresholds falling in the same general region of the gradients (Fig. 2). The GF-Raw and GF-Random turnover functions were notable exceptions to this pattern. Unlike the aggregate turnover functions for the five sets of outlier SNPs, which exhibited pronounced thresholds, the turnover functions for SNPs selected at random largely lacked thresholds; instead, turnover tended to be relatively constant along the seven environmental gradients. For GF-Raw, SNP turnover was more rapid at the colder and drier portions of the temperature and precipitation gradients than for the other sets of outlier SNPs, reflecting the substantial differences in patterns of turnover in the individual outlier SNPs uniquely detected by GF-Raw (Fig. 3, Supplementary Fig. S4). Integrating across all environmental predictors, the distribution of total R² across SNPs differed markedly among outlier detection methods (1-way ANOVA: F = 152.18; df = 3, 823; P < 0.0001), with the highest R² values coming from GF-Raw and GF-X and lower R² values from outliers detected by Bayenv and LFMM (Supplementary Fig. S5).
Spatial patterns of genomic variation - The GF models fit to different sets of SNPs produced different predicted patterns of genomic variation (Fig. 4). The most similar mapped predictions were between GF models fitted to outlier SNPs from Bayenv, LFMM, Bayenv-LFMM, and GF-X. Differences in predicted spatial patterns were greatest between GF-Raw and all other sets of outlier SNPs, followed by GF fit to SNPs selected at random, with the largest range-wide differences being between GF-Raw and GF-X. Differences in mapped patterns were generally greatest in the southern third of balsam poplar’s range and, for most comparisons, reached a maximum in a latitudinal band centered near 50° N and in trailing range edge populations in the Rocky Mountains.
Genetic offsets & climatic transfer distances - Because GF-X had the largest proportion of outliers that overlapped with other detection methods (and conversely, the smallest proportion of unique SNPs), here we report results for GF-X only. The northwesternmost populations, those most distant from VT, were predicted to have the largest genetic offsets associated with transplanting populations from their home environment to the VT common garden (Fig. 5a). The pattern of predicted genetic offsets was largely reversed for transplanting populations to the IH common garden: populations in the southeasternmost portion of the range, farthest from IH, were predicted to have the largest genetic offsets (Fig. 5c). This resulted in a highly significant negative correlation between the genetic offsets for the two garden sites (r = -0.897, df = 40, P < 0.0001; Supplementary Fig. S6). In contrast, climate-only transfer distances (i.e., genetically naive climate distances based on Mahalanobis distance) showed no clear cline with distance from the common gardens (Fig. 5b, d), and in fact climate-only distances showed a weak but positive correlation across gardens (r = 0.380, df = 40, P = 0.013; Supplementary Fig. S6).
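For readers unfamiliar with the climate-only measure, the following is a minimal sketch of a Mahalanobis distance between a population's home climate and a garden climate. It rests on assumptions that this section does not specify: the covariance matrix is estimated here from the population climates themselves, only two climate variables are used rather than seven, and every numeric value is hypothetical.

```python
import numpy as np


def mahalanobis(x, y, cov):
    """Mahalanobis distance between climate vectors x and y given covariance cov."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Solve cov @ z = diff rather than forming the explicit inverse.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


# Hypothetical home climates (rows = populations; columns = two climate variables).
climate = np.array([
    [-5.0, 400.0],
    [-10.0, 350.0],
    [-2.0, 600.0],
    [-15.0, 300.0],
    [-8.0, 500.0],
])
garden = np.array([-6.0, 550.0])  # hypothetical common garden climate

# Assumption: covariance estimated across the population climates.
cov = np.cov(climate, rowvar=False)
transfer_distances = [mahalanobis(row, garden, cov) for row in climate]
```

Because this distance weights variables only by their covariance in climate space, it carries no information about which gradients are associated with genomic turnover, which is the sense in which it is genetically naive.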
Plotting the populations and the common gardens in the transformed multidimensional genomic space and the untransformed multidimensional environmental space reveals the locations of populations relative to the common gardens in terms of expected genomic similarity (Fig. 6a) and climatic similarity (Fig. 6b), thereby providing a means to conceptualize genetic offsets and climate transfer distances (though in only two of the seven dimensions, as variation along additional axes is not shown). Consistent with the variable importance ranking, latitude (y) and winter temperature (bio11) make the strongest contributions to variation in the multidimensional genomic space (as indicated by the length of the vectors in Fig. 6a). Shading indicates the degree of expected similarity of genetic patterns, with locations of similar shading expected to have similar genomic composition. Numerous populations are predicted to have genomic patterns similar to those predicted for the climate of the VT common garden. These populations plot near the VT common garden in the transformed genomic space and therefore have lower predicted genetic offsets for movement to the VT common garden climate. In contrast, all seven variables contribute roughly equally to variation in the untransformed environmental space (Fig. 6b), and the locations of populations and their distances from the common gardens reflect climatic similarity rather than underlying genomic patterns. For example, SSR is located within the unique higher-elevation climate space (Fig. 6b), despite having a predicted genetic composition similar to that of some eastern populations (Fig. 6a).
Genetic offset prediction of common garden performance - Genetic offset was significantly associated with the realized performance of populations transferred to the novel environments of the common gardens. Genetic offset models explained >60% of the variation in height increment growth (Table 1). Consistent with predictions, height growth was highest for populations experiencing the lowest values of genetic offset and declined with larger values of offset (Fig. 7). The shape of the height-offset relationship was non-linear, represented by a significant quadratic effect (Table 1), with the steepest decline as offset increased above zero, followed by a flattening out at larger genetic offset values. Surprisingly, the estimates of genetic offset made from the random selection of SNPs from the genomic background were just as good or slightly better (R² = 0.66) than genetic offsets based on outlier loci (R² = 0.61-0.63). Climate-only distance had a negative linear association with height growth but was a weaker predictor overall (R² = 0.34), explaining only a little more than half as much of the variance in growth as the genetic offset models.
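A quadratic height-offset relationship of the kind described above can be sketched as an ordinary least-squares polynomial fit. This is an illustration only: the models in Table 1 may include terms not shown here, the function name `fit_quadratic` is hypothetical, and the data are synthetic and noise-free so that the fit is exact.

```python
import numpy as np


def fit_quadratic(offset, height):
    """Least-squares fit of height = b0 + b1*offset + b2*offset**2; returns (coeffs, R^2)."""
    offset = np.asarray(offset, dtype=float)
    height = np.asarray(height, dtype=float)
    coeffs = np.polyfit(offset, height, deg=2)  # ordered [b2, b1, b0]
    fitted = np.polyval(coeffs, offset)
    ss_res = np.sum((height - fitted) ** 2)
    ss_tot = np.sum((height - height.mean()) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot


# Synthetic data, NOT from the study: a steep initial decline in height
# that flattens at larger offsets (negative linear term, positive quadratic term).
offset = np.linspace(0.0, 1.0, 9)
height = 10.0 - 8.0 * offset + 3.0 * offset ** 2

coeffs, r2 = fit_quadratic(offset, height)
```

A negative linear coefficient combined with a positive quadratic coefficient reproduces the qualitative shape reported in the text: the decline is steepest near zero offset and levels off as offset grows.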
For the subset of populations that were phenotyped in both VT and IH (N = 9 of 41), we observed a clear rank-order change and crossing reaction norms in the genetic offset predictions, indicative of a tradeoff in the locally adaptive gene-climate relationship across sites (Fig. 8). Consistent with the prediction of a tradeoff, the height growth of populations tended to increase or decrease in a trend that was inverse to the change in genetic offset across sites, although without a consistent change in the rank order of populations. Accordingly, the per-population difference in height growth between sites (VT minus IH) was negatively correlated with the difference in offset (Spearman’s rho = -0.6, one-tailed P = 0.048), although with only 9 populations statistical power was limited.
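Spearman's rho, as used for the between-site comparison above, is the Pearson correlation of the ranked values. The sketch below is a small standard-library implementation with average ranks for ties; the function names and input values are hypothetical, and it does not compute the one-tailed P value.

```python
def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical between-site differences, NOT the study's data.
offset_diff = [1.0, 2.0, 3.0, 4.0, 5.0]
height_diff = [5.0, 4.0, 3.0, 1.0, 2.0]
rho = spearman_rho(offset_diff, height_diff)
```

Working on ranks makes the statistic insensitive to the non-linear shape of the height-offset relationship, which suits a comparison based on only nine populations.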