4.3 Scans for the genomic signature of selection
WGR has more commonly contributed to reverse-genetic approaches, where
whole-genome scans are used to identify loci that have been subject to
selection without directly knowing the traits involved. In this way,
inferences can be made about the genetic basis and evolutionary history
of adaptation even when the ecology and life history of an invasive
species are poorly understood. There are various ways to identify the
signature of natural selection from genomic datasets. When studying a
single invasive population, the footprint of a selective sweep can be
identified from the site frequency spectrum of genetic variation
(DeGiorgio, Huber, Hubisz, Hellmann, & Nielsen, 2016; Nielsen et al.,
2005). Alternatively, comparisons between populations (e.g.,
between different timepoints during invasion or between native and
invasive populations) can be used to identify regions of high
divergence, using summary statistics such as FST or the population branch statistic (Yi et al., 2010). Another rarely
exploited approach to measuring adaptation in invasive species is the
use of sequence data collected in a time series – analogous to
‘evolve-and-resequence’ experiments carried out in laboratory
populations (Long, Liti, Luptak, & Tenaillon, 2015; Schlötterer,
Kofler, Versace, Tobler, & Franssen, 2015). Not only can this approach
be used to rule out pre-invasion adaptation (see Part 1.1), it also
provides a powerful framework in which to identify allele frequency
shifts resulting from simple or polygenic adaptation (Buffalo & Coop,
2020; Otte & Schlötterer, 2020). Where samples are not readily
available from early timepoints in the invasion, historical museum or
herbarium samples can be used to infer past allele frequencies (Bi et
al., 2019; McGaughran, 2020). All the approaches mentioned above can be
used with SNPs, transposable elements or structural variants, which are
readily detectable using WGR data and difficult to measure using other
sequencing technologies (Bertolotti et al., 2020).
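As an illustration of the divergence-based statistics mentioned above, the population branch statistic can be sketched from per-site allele frequencies in three populations. This is a minimal sketch, not the implementation used in any cited study. It assumes a simple heterozygosity-based FST with equal population weights and no sample-size correction, and the function names and the labels "focal", "sister" and "outgroup" (e.g., invasive, native source and a more distant population) are hypothetical.

```python
import math


def fst(p1, p2):
    """Per-site FST between two populations from allele frequencies.

    Assumption: FST = (H_T - H_S) / H_T with equal population weights and
    no correction for sample size; real analyses usually use a corrected
    estimator averaged over many sites in a window.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # site is monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def branch_length(f, cap=0.999999):
    """Transform FST to a divergence time scale, T = -ln(1 - FST)."""
    return -math.log(1 - min(f, cap))  # cap avoids log(0) at fixed differences


def pbs(p_focal, p_sister, p_outgroup):
    """Population branch statistic for the focal population."""
    t_fs = branch_length(fst(p_focal, p_sister))
    t_fo = branch_length(fst(p_focal, p_outgroup))
    t_so = branch_length(fst(p_sister, p_outgroup))
    return (t_fs + t_fo - t_so) / 2


# An allele at high frequency only in the focal population gives a long
# focal branch; the same divergence placed on the outgroup gives none.
print(pbs(0.9, 0.1, 0.1))
print(pbs(0.1, 0.1, 0.9))
```

In practice such values would be computed in windows along the genome and the upper tail of the distribution inspected, rather than interpreted site by site.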
There are already a handful of examples of selection scans being used in
invasion biology. For example, a genome-wide scan for association with
invasiveness in 16 invasive and 6 native populations of Drosophila
suzukii identified SNPs in two genes associated with independent
invasion routes (Olazcuaga et al., 2020). Using a similar approach that
controlled for population structure, genome scans across the global
distribution of P. xylostella identified three potentially novel
insecticide resistance alleles (You et al., 2020), and signatures of
positive selection were associated with sugar receptor genes in Hyphantria cunea (mulberry moth) (Wu et al., 2019). Other studies
have made use of WGR data by identifying structural variants and
transposable elements and investigating their effect on fitness in invasive
populations. For example, again in Drosophila suzukii, fifteen
putative adaptive transposable elements were identified, one of which
was 399 bp from a SNP previously associated with invasion success in this
species (Mérel et al., 2020). In this way, WGR can identify an otherwise
invisible dimension of genetic variation.
It has long been realised that genome scans for selection need to
account for background genomic processes that can lead to false
positives for adaptive loci. These can include genetic drift caused by
demographic changes and selective processes such as background
selection. In some cases, the peculiar biology of invasive species makes
them especially prone to such problems, as genetic bottlenecks can lead
to signatures of reduced variation similar to those caused by selection
(see Box 2). Furthermore, any summary statistic capturing the coalescent
process will be influenced by variation in recombination rate (c)
and changes in the effective population size (Ne)
(Barton & Etheridge, 2004; Booker, Yeaman, & Whitlock, 2020; Brandvain
& Wright, 2016). Ne and c can be
estimated empirically with WGR data. Changes in Ne can be inferred using demographic inference
methods (see Part 3.1), while recombination rate variation along the
genome can be estimated by constructing a linkage map or with phased WGR
data (Chan, Jenkins, & Song, 2012). User-friendly modelling tools, such
as SLiM, can be used to explore the expected distribution of
summary statistics under various combinations of Ne and c (Haller, Galloway, Kelleher,
Messer, & Ralph, 2019; Haller & Messer, 2019). Tests for selection
that explicitly incorporate demography and recombination can also be
used (e.g., Luqman, Widmer, Fior, & Wegmann, 2020). Therefore,
despite the confounding effects of recombination rate variation and
demographic history on the summary statistics used in genome scans, it
is now easier than ever to identify and account for these effects.
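The logic of building a neutral expectation under a given demographic history can be sketched with a very small simulation. The example below is not SLiM and is far simpler than the forward simulations cited above. It assumes a single unlinked biallelic locus evolving by neutral Wright-Fisher drift, and the function names, population sizes and number of replicates are hypothetical choices made for illustration only. It asks how often drift alone produces an allele frequency shift at least as large as an observed one, with and without a founding bottleneck.

```python
import random


def simulate_drift(p0, pop_sizes, rng):
    """Neutral Wright-Fisher drift at one biallelic locus.

    pop_sizes gives the diploid population size in each successive
    generation; returns the allele frequency after the last generation.
    """
    p = p0
    for n in pop_sizes:
        copies = 2 * n
        # Binomial sampling of allele copies for the next generation
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
    return p


def neutral_shift_pvalue(p0, observed_p, pop_sizes, n_reps=500, seed=1):
    """Empirical probability that drift alone gives a shift >= the observed one."""
    rng = random.Random(seed)
    observed = abs(observed_p - p0)
    hits = sum(
        1
        for _ in range(n_reps)
        if abs(simulate_drift(p0, pop_sizes, rng) - p0) >= observed
    )
    return (hits + 1) / (n_reps + 1)  # add-one correction avoids p = 0


# Hypothetical demographies over ten generations
constant = [200] * 10
bottleneck = [10] * 5 + [200] * 5

# The same shift (0.2 to 0.5) is far less surprising after a bottleneck
print(neutral_shift_pvalue(0.2, 0.5, constant))
print(neutral_shift_pvalue(0.2, 0.5, bottleneck))
```

The contrast between the two demographies shows why a threshold calibrated without the inferred bottleneck would flag drift-driven shifts as candidate adaptive loci. A realistic null model would also need linkage and recombination rate variation, which this sketch ignores.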