Introduction
Aside from limits set by dispersal barriers, distribution range borders are commonly assumed to be the result of the constraints imposed by the ecological requirements of species (either biotic or abiotic), as environmental gradients may change towards suboptimal conditions near range edges (Hutchinson 1957; Brown 2002). All in all, the factors that shape these distribution borders not due to dispersal barriers are ultimately linked to local adaptation dynamics; simply put, a species does not occur outside its distribution border because it is not adapted to the environmental conditions beyond it (Kirkpatrick and Barton 1997; Bridle and Vines 2007). However, the edge of a species’ range is typically more abrupt than expected, given that environmental change towards suboptimal conditions or niche boundaries is usually gradual (Sexton et al. 2009). Moreover, all across their ranges species meet a range of conditions that is much greater than the environmental change that takes place at the edge of the range (Kirkpatrick and Barton 1997). To understand these seemingly arbitrary boundaries to range expansion, Haldane (1956) proposed gene ‘swamping’ as a center-border effect by which gene flow from central to marginal habitats causes maladaptation at the edges of the range, reducing population density and constraining range expansion. This dynamic pattern would jeopardize adaptation at the edge of the range even if the genetic variants that could promote range expansion are present in the genetic pool of a species, because gene swamping would hamper a rise in the frequencies of adaptive alleles at range limits (Haldane 1956). However, this hypothesis has been subject to continuous debate (Nosil and Crespi 2004; Sexton et al. 2011; Polechová 2018).
Another possibility is that range limits arise because a species has fully colonized the spatial projection of its ecological niche, in such way that niche expansion must precede range enlargement (Hutchinson 1957). In such cases, since niche expansion implies adaptation to more extreme conditions along one or more environmental gradients, this process is limited by the magnitude of the additive genetic variance associated with adaptation to these gradients (Lande and Shannon 2006). Conversely, if habitat and/or environmental suitability remains high at and beyond range boundaries, then dispersal constraints, gene swamping and/or marginal demographic effects could be defining the location and shape of distribution limits (Kirkpatrick and Barton 1997; Bridle and Vines 2007; Charlesworth 2009; Peterson 2011, Duncan et al. 2015). However, if the genetic variability required for range expansion is not available in the genetic pool of a species, gene swamping cannot be invoked to explain range limits (Polechova and Barton 2015, Polechova 2018). Evaluating these different hypotheses is thus crucial to understand the adaptive causes underlying the formation and shaping of range edges (Sexton et al. 2009; Lee-Yaw et al. 2018).
Landscape genomics approaches have boosted our understanding of how environmental variables drive the genetic dynamics of local adaptation (Hoban et al. 2016; Ahrens et al. 2018). These methods can be applied to model (and predict) potential range boundaries by looking at the shifts in allelic frequencies along environmental gradients (Eckert et al. 2008; Herrera and Bazaga 2008, Razgour et al. 2019). Thus, it is possible to explore what loci govern the adaptability of a species, and to model the suitability of certain genotypes to different habitats all over a species’ range (i.e., environmental association analyses; Rellstab et al. 2015; Whitlock and Lotterhos 2015). However, describing correlations between genotypes and environmental gradients is only one part of the challenge, because some loci could show strong but spurious associations with environmental gradients due to population history rather than natural selection. Thus, it is also paramount to identify the loci underpinning local adaptation, since they make the fraction of genetic variation that is relevant to explain an individual’s ability to disperse to, and thrive in, new habitats (Dudaniec et al. 2018). Identifying these combinations of loci under selection is a prerequisite to understand the adaptive basis of the origin and maintenance of new populations and, therefore, the genetic dynamics that shape range boundaries (Hargreaves et al. 2014).
New populations of a species can be established either by 1) the arrival of individuals carrying genetic adaptations to that new site, or 2) the arrival of genetic variants that can recombine in situ to generate new locally adapted genotypes (Barton and Etheridge 2018). Discerning between these two possibilities is a hard challenge. In particular, the last scenario is controversial because it assumes that an individual is able to reproduce in a location to which it is not adapted. However, the potential to produce new genetic combinations increases with dispersal rate, by rising the probability that different (suboptimal) genotypes eventually co-occur at the same new habitats (Barton and Etheridge 2018; LaRue et al. 2018). Thus, study organisms with low dispersal rates should reduce the confounding effects of dispersion, allowing us to focus on local adaptation dynamics as responsible of expansion constraints (Lee-Yaw et al. 2018).
In this study, we integrate genomic data into the distribution modelling of a lacertid lizard species, the Large Psammodromus Psammodromus algirus , whose phylogeographical and ecological differentiation is well characterized (Carranza et al. 2006, Díaz et al. 2017, Llanos-Garrido et al. 2019). This lizard is widespread across the Western Mediterranean region, and its range encompasses contrasting environmental conditions, extending from northern Africa in the south to southwest France in the north, and from Portugal in the west to Tunisia in the east (Fig. 1A). We used 21 loci putatively under selection (hereafter outliers; Llanos-Garrido et al. 2019) to model distribution boundaries on the basis of five closely located central populations that cover a representative fraction of the environmental variation faced by P. algirus across its entire distribution range. These outlier SNPs were identified by two methods of outlier detection, one of them independent of population coancestry (an extension of the Lewontin–Krakauer test; Bonhomme et al. 2010) and the other one not (Bayescan v.2.1; Foll and Gaggiotti 2008). To model the distribution of P. algirus from these SNPs putatively under selection, we ran an environmental association analysis (Rellstab et al. 2015) with allelic variants at loci under selection as predictors, and we extrapolated, for all possible allelic combinations at those loci, the geographical locations with suitable environmental conditions. By doing so, we were able to infer not only an ecological niche model of the whole distribution range of the species, but also the genotypes that would potentially be adapted to each geographical grid cell within it. We assumed a simple model without center-border biases, and in which every genotype is able to reach every geographic cell. Also, we only used the (adaptive) genetic variation at those particular loci likely to be associated with environmental conditions, in such way that we could know whether actual range limits are linked to adaptability thresholds determined by the amount of additive genetic variance available for selection. This approach allowed us to test whether a species’ distribution range can be explained by the genetic dynamics that shape local adaptation, without invoking demographic processes such as gene swamping or increased homozygosity near the edge of the range (Herrera and Bazaga 2008; Polechová et al. 2009).
Specifically, we aimed to answer the following questions: 1) Is it possible to infer an entire distribution range on the basis of a limited number of outlier loci? 2) How important are limitations to dispersal in defining a species’ range limits? 3) Are there fewer genotypes adapted to marginal conditions than to core conditions? And 4) is there an adaptability threshold, determined by the availability of genetic variance under selection, that constrains the expansion of the range beyond its actual boundaries?