Range inference accuracy
Our approach combines the advantages of correlational and mechanistic distribution models (Holt 2009, Kearney and Porter 2009): it relies on genotype-environment correlations derived from the geographical distribution of field-captured individuals, but it deals with SNPs putatively under selection, which should ultimately be correlated with functional differences in morphology, physiology, and/or behavior (see also Brown et al. 2016, Razgour et al. 2019). This should allow us to scale our analysis up to the species level (Buckley 2010), because extrapolating our results to all possible allele combinations at loci under selection should cover a much wider range of adaptive phenotypic variation than that revealed by physiological measurements of a restricted sample of individuals or populations and, more importantly, than that implied by a priori decisions about which traits may be adaptive across an entire species’ range. The only assumption behind this assertion is that all allelic combinations are actually plausible, as should be expected if loci are correctly pruned for linkage disequilibrium. Moreover, as long as local adaptation leads to a heterogeneous distribution of genotypes adapted to different parts of the range, our GEAM should be more realistic than mechanistic models, because such models are based on physiological measurements of individuals that may lack the specific adaptations required to thrive in habitats different from their own (such as those that determine range boundaries; Svardal et al. 2015, Brown et al. 2016, Razgour et al. 2019). However, the primary aim of this study is not to propose a new approach for inferring species ranges (many others are more appropriate for this task), but to unveil the potential role of genetic variability as a major driver of range shape. Thus, the accuracy of this approach is only a symptom of the importance of this role, as the underlying model was built solely on the geographical distribution of SNPs under divergent selection.
Genotype-based range inference was especially accurate at the southern edge of the species’ range, including a precise delimitation of range gaps in Morocco, where detailed chorological information is available (Bons and Geniez 1996). However, we could not test the accuracy of our model for the rest of North Africa because detailed distribution maps of Psammodromus algirus are lacking for this area. The only information about these locations was obtained from the IUCN Red List database (which does not provide data about within-range gaps) and the GBIF database (which has only 33 records distributed among 16 locations in this area [Fig. 1A], all of which except one were predicted by our model). Nevertheless, the distribution borders suggested by these databases were accurately predicted by our range inference.
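The comparison against occurrence records described above can be illustrated with a minimal sketch. It is not the procedure used in this study: the function name `recovery_rate`, the cell identifiers, and the toy values are hypothetical and serve only to show how repeated records collapse into unique locations before the proportion recovered is computed. If "all of which except one" refers to the 16 locations, the corresponding proportion would be 15/16, or about 0.94.

```python
from typing import Hashable, Iterable, Set


def recovery_rate(occurrence_cells: Iterable[Hashable],
                  predicted_cells: Set[Hashable]) -> float:
    """Proportion of unique occurrence locations that the model predicts as present.

    occurrence_cells: one (hypothetical) grid-cell identifier per occurrence record.
    predicted_cells: identifiers of the cells where the model infers presence.
    """
    # Several records may share a location, so collapse them to unique cells first.
    unique_cells = set(occurrence_cells)
    if not unique_cells:
        raise ValueError("at least one occurrence record is required")
    hits = sum(1 for cell in unique_cells if cell in predicted_cells)
    return hits / len(unique_cells)


# Toy data (invented for illustration, not taken from GBIF or from this study).
toy_predicted = {"c1", "c2", "c3"}
toy_records = ["c1", "c1", "c2", "c4"]  # four records at three unique locations
toy_rate = recovery_rate(toy_records, toy_predicted)

# Arithmetic for 16 locations with 15 recovered (assumed reading of the text).
assumed_rate = recovery_rate(range(16), set(range(15)))
```

A presence-only measure of this kind says nothing about false positives or within-range gaps, which is why the Moroccan and Iberian atlases provide a stronger test than the GBIF records.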
Regarding northern boundaries in the Iberian Peninsula, where detailed data on the presence of P. algirus are also available (Pleguezuelos 1997), we did not recover the presence of the species in a relatively large NW area where lizard populations do occur, inhabiting suitable habitat patches near the cool, humid end of the tested environmental gradient. Across southern France, the real distribution range of the species does not extend beyond the Rhône River delta, a possible geographical barrier that could not be predicted by our method of range inference (see Carranza et al. 2006 for an explanation of how and when the species crossed another important potential geographical barrier, namely the Strait of Gibraltar). However, in the Iberian Peninsula our model successfully recovered the central and eastern parts of the northern range boundary, as well as several within-range gaps associated with mountain ranges (around central plateaus and river valleys) and arid regions in southeastern Spain.
The asymmetry in accuracy when predicting northern vs. southern edges could be explained if our outlier analyses failed to capture all the genetic variation under selection that is associated with this gradient (see below). Another possible explanation is that the relative role of climatic variables in determining northern vs. southern edges of the species’ range could be asymmetric (Cunningham et al. 2015). Ectotherms tend to show this asymmetric pattern by underfilling their potential range at the southern/warmer edge while overfilling it at the northern/colder one (Sunday et al. 2012). However, this is not the case for P. algirus, because our model did not predict any suitable habitat beyond the actual boundaries of the species’ range, but rather underpredicted the range at its northern edge. Simply put, it seems that P. algirus is able to overcome the predicted cold constraint, even though previous studies have found that cold tolerance is the most limiting climatic factor for range expansion in ectotherms (St Clair and Gregory 1990, Sunday et al. 2014). If an unknown constraint, whether biotic or abiotic, were limiting the species at its northern edge, the inferred range would exceed the actual border, whereas the opposite is true for our model (Fig. 3 and Fig. 4). Thus, it seems that some unknown variables, unrelated to either temperature or humidity, may be enabling the occurrence of P. algirus beyond our predicted northern boundaries (Guisan et al. 2006, Pearman et al. 2010, Wisz et al. 2013).