INTRODUCTION
Climate change is expected to become a major threat to biodiversity this
century (Sala et al.,
2000; Urban, 2015), with cascading impacts on human well-being and
ecosystem function (Pecl et
al., 2017). Anticipating and mitigating these impacts requires
actionable predictions of biological responses, which are likely to
become increasingly difficult to forecast under the novel climates of
the future
(Fitzpatrick, Blois, et
al., 2018; Urban et al., 2016). The adaptive capacity of species
represents an important component of climate change vulnerability
(Dawson, Jackson, House,
Prentice, & Mace, 2011), yet few studies incorporate local adaptation
into forecasting models, while even fewer have attempted to compare
genomic predictions to actual organismal responses.
Recent technological advancements now provide access to massive
quantities of data pertinent to biodiversity science and conservation
(e.g., species occurrence databases, genome-scale DNA sequencing,
high-resolution projections of future climate;
Wüest et al., 2020). At the
same time, new sophisticated machine learning methods have emerged that
can take advantage of these data to identify conservation risks and
opportunities under a changing climate. In particular, the application
of machine learning to genomic studies of local adaptation represents an
especially promising frontier for improving our understanding of biotic
responses to climate change and the potential to consider climate
vulnerability at the population level
(Fitzpatrick &
Keller, 2015; Gougherty, Keller, Chhatre, & Fitzpatrick, 2020;
Savolainen, Lascoux, & Merilä, 2013).
Fitzpatrick & Keller (2015)
described how a machine learning method known as Gradient Forests (GF;
Ellis, Smith, & Pitcher,
2012) can be used to (1) analyze and map spatial variation in allele
frequencies as a function of environmental gradients and (2) project
patterns of genomic variation under future climate. GF derives
monotonic, nonlinear functions that characterize compositional turnover
in allele frequencies along each fitted environmental gradient. In
addition to identifying the primary environmental drivers associated
with genomic variation, these turnover functions provide unique insights
into the nature of how genomic patterns vary along multiple
environmental gradients, including where changes in allele frequencies
are rapid or slow across space. The turnover functions from GF can also
be used to transform (or rescale) the fitted environmental predictors
from their arbitrary anthropogenic measurement units (e.g., ℃ of
temperature or mm of precipitation) to common biological units of
compositional turnover (Ellis
et al., 2012). By transforming each of the predictor variables using
its associated turnover function, the multidimensional environmental
space can be converted into a multidimensional genomic space that
characterizes differences in the expected genetic makeup between
populations in different environments. By applying the turnover
functions to scenarios of environmental change, one can project expected
genomic patterns under future climate. The Euclidean distance between
the locations of each population in the current and future genomic
spaces characterizes the magnitude of expected change in genetic
composition for each population given the pattern of climate change in
each location. Fitzpatrick &
Keller (2015) termed this distance the “genetic offset”, which can be
viewed as a metric of the degree of expected maladaptation when a
population is exposed to rapid climate change, assuming no adaptive
evolution in situ or migration to allow adaptive alleles to track
climate change. Gougherty et
al. (2020) recently extended the genetic offset concept to consider the
contributions of climate maladaptation, migration, and the potential for
future novel gene-climate associations to the vulnerability of
climatically adapted populations.
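The offset calculation described above can be illustrated with a minimal sketch. It is not the GF implementation: it assumes that each turnover function has already been estimated and can be approximated by linear interpolation over a grid of cumulative importance values. The names `make_turnover` and `genetic_offset`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_turnover(env_grid, cumulative_importance):
    """Build a monotonic turnover function by linear interpolation.

    Assumption: the grid of cumulative importance values stands in for a
    turnover function that a fitted GF model would normally provide.
    """
    env_grid = np.asarray(env_grid, dtype=float)
    cum = np.asarray(cumulative_importance, dtype=float)
    if np.any(np.diff(env_grid) <= 0) or np.any(np.diff(cum) < 0):
        raise ValueError("grid must be increasing and turnover non-decreasing")
    return lambda x: np.interp(x, env_grid, cum)

def genetic_offset(current_env, future_env, turnover_funcs):
    """Euclidean distance between current and future positions in the
    transformed (genomic) space, one value per population."""
    current_env = np.asarray(current_env, dtype=float)
    future_env = np.asarray(future_env, dtype=float)
    # Rescale each environmental variable with its own turnover function.
    cur = np.column_stack(
        [f(current_env[:, j]) for j, f in enumerate(turnover_funcs)])
    fut = np.column_stack(
        [f(future_env[:, j]) for j, f in enumerate(turnover_funcs)])
    return np.sqrt(((fut - cur) ** 2).sum(axis=1))

# Hypothetical turnover functions for temperature (deg C) and precipitation (mm).
temp_turnover = make_turnover([0, 10, 20], [0.0, 0.1, 0.5])
precip_turnover = make_turnover([500, 1000, 1500], [0.0, 0.2, 0.3])

# Two hypothetical populations; the future scenario warms each site by 2 deg C.
current = np.array([[5.0, 1000.0], [15.0, 1000.0]])
future = current + np.array([2.0, 0.0])

offsets = genetic_offset(current, future, [temp_turnover, precip_turnover])
print(offsets)
```

In this toy example both populations experience the same 2 ℃ of warming, but the second population sits on a steeper part of the temperature turnover function and therefore receives a larger offset. This is the sense in which the rescaling converts climate units into units of compositional turnover.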
Since the publication of
Fitzpatrick & Keller
(2015), a growing number of studies have used genetic offsets to
estimate climate maladaptation in a variety of species, including trees
(Gugger,
Liang, Sork, Hodgskiss, & Wright, 2018; Ingvarsson & Bernhardsson,
2020; Jia et al., 2020; Martins et al., 2018), birds
(Bay et al., 2018; Ruegg
et al., 2018), and agricultural crops such as maize landraces in Mexico
(Aguirre-Liguori,
Ramírez-Barahona, Tiffin, & Eguiarte, 2019). However, like projections
of species-level responses to climate change from species distribution
models, genetic offsets are in essence derived from a correlative,
space-for-time substitution approach
(Blois, Williams,
Fitzpatrick, Jackson, & Ferrier, 2013) that ignores the enormous
complexities underlying actual evolutionary responses of populations to
environmental change, including interactions between selection,
effective population size, and evolutionary processes shaping adaptive
variation (e.g., migration, mutation, recombination). Instead, the use of
genetic offsets assumes that, after correcting for neutral population
structure, correlations between allele frequencies and environmental
gradients reflect current patterns of local selection and relative
fitness and that these existing gene-environment associations across
space can be used to project the magnitude of change in allele
frequencies expected through time to maintain gene-environment
associations at their current status quo. Very few
studies have tried to relate local adaptation analyses and associated
predictions to actual organismal responses. As such, genetic offsets
lack empirical validation, and it remains unknown what utility, if any,
the concept has for predicting the actual performance of populations in
novel environments.
Here we use machine learning, population genomic data, and common garden
experiments to provide an empirical space-for-time test of the extent to
which genetic offsets predict performance of populations in new
environments. We measured growth performance of trees that were
collected from climatically diverse populations and clonally propagated
in two common gardens. For these same populations, we also obtained
genome-wide single nucleotide polymorphisms (SNPs), which were used in a
series of genome scans for local adaptation employing multiple methods
to determine outlier loci associated with climate. We then fit GF to the
different sets of candidate SNPs determined using the different outlier
detection methods and used these models to (1) identify the primary
environmental variables driving the signals of local climate adaptation
in the genome, (2) fit flexible functions describing how genetic
patterns vary along the gradients, and (3) predict genetic offsets
associated with transplanting individuals from their home climatic
environment to the climates they experienced at the common garden sites.
Specifically, we aim to address the following questions:
- How do GF models fit to different sets of statistical outlier SNPs
differ in terms of variable importance, turnover functions, and
predicted spatial patterns?
- How well do genetic offsets predict responses of populations
transplanted to new common garden environments, and do genetic offsets
outperform naive ‘climate-only’ transfer distances?
- How sensitive is the predictive ability of genetic offsets to the
composition of SNP panels, whether derived from different outlier
detection methods or randomly sampled from the genomic background?
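As a point of reference for the second question, a naive ‘climate-only’ transfer distance can be sketched as below. This is an assumption for illustration, not necessarily the definition used in this study: each climate variable is standardized across home sites, and the distance is the Euclidean distance between each population's home climate and the garden climate. The function name `climate_transfer_distance` and all values are hypothetical.

```python
import numpy as np

def climate_transfer_distance(home_climate, garden_climate):
    """Euclidean distance between each home climate and the garden climate.

    Assumption: variables are z-scored using the mean and sample standard
    deviation of the home climates, so that each variable contributes on a
    comparable scale regardless of its measurement units.
    """
    home = np.asarray(home_climate, dtype=float)
    garden = np.asarray(garden_climate, dtype=float)
    mean = home.mean(axis=0)
    sd = home.std(axis=0, ddof=1)
    z_home = (home - mean) / sd
    z_garden = (garden - mean) / sd
    return np.linalg.norm(z_home - z_garden, axis=1)

# Three hypothetical source populations: temperature (deg C), precipitation (mm).
home = np.array([[4.0, 800.0], [8.0, 1000.0], [12.0, 1200.0]])
garden = np.array([8.0, 1000.0])  # hypothetical common garden climate

distances = climate_transfer_distance(home, garden)
print(distances)
```

Unlike a genetic offset, this distance weights every standardized climate variable equally and linearly, so it carries no information about which gradients are associated with genomic turnover or where along a gradient turnover is rapid.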