DISCUSSION

This study presents a fundamental analysis of the drivers of diversification and speciation in a major group of seabirds, by constructing the first MSC time-calibrated species tree and biogeographical analysis for shearwaters based on a fully resolved phylogeny. This allowed us to explicitly explore the paleoceanographic events that may have driven the diversification of the group, as well as to infer their ancestral range using formal biogeographic analyses and evaluate the role of dispersal, vicariance and founder events in shearwater diversification. We also discuss the role of body size in shearwater diversification, and we consider potential ecological and evolutionary forces that may have shaped its evolution. Lastly, we used the evidence uncovered here to explain incongruences between the current taxonomy and the patterns of genomic divergence.

Limitations of divergence time estimation using SNAPP

A limitation of the SNAPP approach for absolute divergence time estimation is the assumption of equal and constant population sizes on all branches of the phylogeny (Stange et al., 2018). This assumption is clearly violated in our analysis, as shown in the SNAPP results without any age constraint (Figure S3). As a result, our divergence times might be slightly overestimated for lineages with larger population sizes than the overall estimation and vice versa. An additional limitation is the heterogeneity in substitution rates among shearwater lineages (Ferrer-Obiol et al. under review), which would likely benefit from the use of a relaxed clock (Drummond, Ho, Phillips, & Rambaut, 2006; Rannala & Yang, 2007) instead of the strict clock model implemented by (Stange et al., 2018). Nonetheless, previous analyses to select a clock model for this PE-ddRAD dataset showed that the strict clock model obtained the best marginal likelihoods (Ferrer Obiol et al. under review). We therefore do not expect this limitation to significantly reduce accuracy. Despite these limitations, given the relatively high overall population size estimated in this study, and given the shallow timescales encompassed by the shearwater phylogeny, we argue that the older divergence times estimated by concatenation analyses (Table S2) are most likely caused by a higher degree of node age error in the latter analyses, potentially caused by failing to fully consider the role of ILS (Angelis & Dos Reis, 2015; Ogilvie, Bouckaert, & Drummond, 2017; Ogilvie, Heled, Xie, & Drummond, 2016).

Biogeographic history of shearwaters

Our biogeographic analyses indicate that vicariance and founder events are probably the main mechanisms of speciation in shearwaters, as expected by their global distribution and high mobility. Unlike other Procellariiformes (Friesen, Smith, et al., 2007), sympatric speciation has not been described in shearwaters. Indeed, very few records of sister species inhabiting the same island exist in the wild and are limited to marginal overlaps between parapatric species (Navarro, Forero, et al., 2009). The biogeographic analyses suggest that shearwater dispersal is favoured by surface ocean currents; nevertheless, we cannot draw firm conclusions given the reduced differences in log-likelihood (< 3 units) between ancestral range estimation models with or without a dispersal matrix that restricted dispersal to areas connected by surface ocean currents (Table 1). Several studies have shown that winds are a major determinant of foraging ranges and migratory routes of seabirds, especially in the Procellariiformes (González-Solís et al., 2009; Weimerskirch, Louzao, de Grissac, & Delord, 2012). Winds are also a primary driver of surface ocean currents; hence, our study suggests that winds could also be an important determinant of species dispersal in the Procellariiformes.
Ancestral range estimation analyses inferred the South Australia - New Zealand area as the ancestral region of shearwaters with the highest support followed by the Northern and Tropical Pacific (Figure S4). The South Australia - New Zealand area is currently a hotspot of global seabird biodiversity (Croxall et al., 2012) and has the greatest number of shearwater species breeding in any single area (Dickinson & Remsen, 2013). On the other hand, the coast of California harbours the highest diversity of shearwater fossils from extinct species and some of the oldest ones (Miller, 1961). These observations suggest that the current biogeographic analyses represent a more probable hypothesis of the ancestral area of shearwaters than previous hypotheses, which suggested that the North Atlantic was the ancestral area based on the relatively rich shearwater fossil record in this area (Austin, 1996; Kuroda, 1954). The phylogenetic position of the oldest North Atlantic shearwater fossil species (P. raemdonckii and P. arvernensis ) is still unclear (Olson, 1985) and the age of P. micraulax, which was believed to be the oldest shearwater fossil species (lower Miocene, Hawthorne Formation, Florida) is uncertain (Ferrer-Obiol et al., in review). Thus, earlier suggestions of the North Atlantic as the ancestral area of shearwaters may have been misled by these uncertainties in the fossil record.
The MRCA of Calonectris had a North Pacific and North Atlantic distribution. Fossils of at least 5 species have been described from the North Atlantic dating back to ~14 Mya (Olson, 2008; Olson, 2009; Olson, & Rasmussen, 2001), supporting this area as a speciation hotspot for the genus. However, considering the mobility of the genus and given that the MRCA of Calonectris was probably a long-distance migrant (Figure S6), we can not eliminate the possibility that the regions where these fossils were found were not the breeding areas for the species. The estimated divergence time (~3.54 Mya) between the North Pacific and the North Atlantic clades is very similar to previous estimates based on mtDNA rates (~3.44 Mya; Gómez-Díaz et al., 2006) and suggests a vicariant event as the result of the gradual closure of the isthmus of Panama, as has been observed in other marine organisms (Lessios, 2008).
Our analyses indicate that Ardenna had a South Australia - New Zealand origin and, thereafter, some lineages colonised the Southern Ocean (Figure 1), which disagrees with the North Atlantic origin ofArdenna proposed by (Austin, 1996) based on the fossil record. Extant species are long-distance trans-equatorial migrants that can be locally common on North American and European coasts (Carey, Phillips, Silk, & Shaffer, 2014; Shaffer et al., 2006) and based on our ancestral state reconstruction, the MRCA of Ardenna was also most likely a long-distance migrant (Figure S6). We suggest that extinct taxa were also long-distance migrants breeding in the Southern Hemisphere, and that the fossils found in the North Atlantic likely represent birds that died during the non-breeding period.
The ancestors of Puffinus acquired the strongest diving adaptations of the three genera (Olson, & Rasmussen, 2001); these allow them to routinely dive to depths of 55 m (Shoji et al., 2016), providing advantages for reaching prey in the nutrient poor tropical and subtropical waters of the Pacific (inaccessible to most other tropical seabirds; Burger, 2001), where the MRCA of Puffinus most probably originated based on the current ancestral range estimation analyses and the fossil record (Miller, 1961). Although we could not obtain samples for P. subalarisfrom the Galapagos; in a previous study this species formed a clade withP. nativitatis (Austin et al., 2004), which further supports a Tropical Pacific origin ofPuffinus . Most extant Puffinus species are short-distance migrants or dispersers that remain close to their breeding sites throughout the year (e.g. Ramos et al., 2020). Their lower dispersal compared to other shearwater genera may have reduced gene flow and promoted higher species richness. The population sizes ofPuffinus species tend to be small and many had the highest EDGE scores (Table 2), which is a metric that identifies those threatened species that deserve particular attention because of their unique evolutionary history. Predation by invasive alien species is the main current threat for seabirds (Croxall et al., 2012) and is a principal cause of population declines among Puffinusspecies (Rodríguez et al., 2019). Enhanced by predation, intra- and inter-specific competition for nest sites plays an important role in limiting populations of small Procellariiformes, such as Puffinusshearwaters (Monteiro, Ramos & Furness 1996; Ramos, Monteiro, Sola, & Moniz, 1997). At sea, fisheries bycatch is also a main threat for Puffinus shearwaters (Bugoni, Mancini, Monteiro, Nascimento, & Neves, 2008; Cortés, Arcos, & González-Solís, 2017) and one that could drive some species to extinction unless conservation measures are put in place (Genovart et al., 2016). These are likely some of the main reasons why Puffinusshearwaters have the highest number of endangered species among the shearwaters.
Across the three genera, the Pliocene-Pleistocene boundary appeared as a period of high and rapid speciation and dispersal (Figure 1). For instance, Puffinus spread from the Pacific to the North Atlantic, the Southern Ocean, and the Indian Ocean during a rapid radiation. During the Cenozoic, the largest global sea-level changes and oscillations occurred in the Pliocene and Pleistocene (Lisiecki & Raymo, 2007; Miller et al., 2005; Zachos, Pagani, Sloan, Thomas, & Billups, 2001). Neritic waters, which represent the main foraging grounds for medium and large shearwaters, especially during the breeding period, suffered a significant sudden reduction at the end of the Pliocene followed by extreme fluctuation and gradual reduction over the Pleistocene (Pimiento et al., 2017). Global oceanographic changes, such as the end of permanent el Niño, the closure of the Isthmus of Panama and the formation of the Arctic ice cap (Fedorov et al., 2006; O’Dea et al., 2016) may have been the cause of such reduction. This reduction has been hypothesised to be the cause of a three-fold increase in the extinction rate of megafauna associated with coastal habitats (O’Dea et al., 2007; Pimiento et al., 2017). In shearwaters, ~36% of the known extinct fossil species are from the Pliocene (Howard, 1971; Olson, 1985; Olson, & Rasmussen, 2001); together with the long stems in the three shearwater genera (Figure 1), this suggests that the late Pliocene extinction severely affected the group. The subsequent burst of speciation and dispersal was probably promoted by Pleistocene climatic shifts that probably promoted geographic splitting and bottlenecks (Avise & Walker, 1998; Gómez-Díaz et al., 2006). An increase in diversification during this period has also been detected in other seabird groups such as penguins (Cole et al., 2019; Vianna et al., 2020) and even in deep sea species (Eilertsen & Malaquias, 2015).

Body mass as a key phenotypic trait

In the Procellariiformes, body mass is a trait closely related to fitness at the intraspecific level. For instance, body condition (body mass corrected by overall body size) of the progenitors affects breeding success in several species (Barbraud & Chastel, 1999; Chastel, Weimerskirch, & Jouventin, 1995; Tveraa, Sether, Aanes, & Erikstad, 1998). On the other hand, at the interspecific level, the drivers of body mass variation are poorly understood despite the high variation exhibited by the Procellariiformes (Nunn & Stanley, 1998). Our results shed some new light on potential behavioural and distributional drivers that may be affecting body mass variation in the Procellariiformes, although caution must be taken at interpreting our findings that are merely correlational.
Migratory strategy was the best evaluated predictor for mean body mass (Figure 2). This correlation is likely twofold and additive. On the one hand, migratory species tend to be larger (i.e. longer wings in migratory species; Marchetti, Price, & Richman, 1995; Minias et al., 2015) as shown by the significant correlations of all the other body size measures with migratory strategy (Figure 2a). On the other hand, migratory behaviour allows the exploitation of additional resources leading to a higher accumulation of fat deposits. The weaker but significant correlation between migratory strategy and mean body mass when corrected by body surface (Table S3) supports the twofold and additive effect of this correlation.
Within an endothermic species or a group of closely related endothermic species, individuals inhabiting colder habitats and higher latitudes tend to be larger than those inhabiting warmer environments and lower latitudes (Bergmann, 1848). This geographical pattern in body size holds for birds throughout the world at the intraspecific (Ashton, 2002; Meiri & Dayan, 2003) and interspecific level (Bergmann, 1848) although the mechanisms responsible for the generation of this trend are subject to much debate (Ashton, 2002; Meiri, 2011). In the shearwaters, this pattern has also been shown to apply to intraspecific body size variation in the Streaked Shearwater (Calonectris leucomelas ; Yamamoto et al., 2016). Among shearwater species, we also found a positive significant correlation between breeding latitude and mean body mass (Figure 2 and Figure S5C), despite previous studies that have shown that conformity to Bergmann’s Rule tends to be weaker for migratory and enclosed nesting species (Mainwaring & Street, 2019; Meiri & Dayan, 2003). The correlation was strongest between maximum breeding latitude and mean body mass corrected by body surface (R2 = 0.387; Table S3), suggesting that heavier bodies, independent of body size, might provide a better adaptation to thrive in higher and colder latitudes. However, these correlations could also be indirectly driven by a higher tendency of species living in higher latitudes to be migratory and/or by differences in diving behaviour, which could not be explored in this study.
The strong association between body mass range and latitudinal range is likely twofold. On the one hand, exploiting larger foraging areas may allow for ecological segregation between sexes and size dimorphism (De Felipe et al., 2019). Indeed, ecological segregation has been shown to be the most likely cause of size dimorphism in other Procellariiformes (González-Solís, 2004). On the other hand, larger body mass differences may arise between individuals that are more efficient at exploiting the available resources compared to those that are less efficient. This might provide the substrate for sexual selection to act on body mass. Higher body condition has been associated with higher breeding success in several species of Procellariiformes (Barbraud & Chastel, 1999; Barbraud & Weimerskirch, 2005; Chastel et al., 1995).

Considerations of shearwater taxonomy

Species delimitation in shearwaters is a challenging and controversial topic, partly due to their remarkably similar morphology (Austin et al., 2004). Conflict has arisen amongst morphological studies, and analyses based on genetic data (i.e., mtDNA and microsatellites), and also between different genetic datasets (Austin, 1996; Genovart et al., 2013; Gómez-Díaz, González-Solís, & Peinado, 2009). In addition, despite being a promising trait for species delimitation, analyses of shearwater vocalizations are limited (Bretagnolle, 1996). Genome-wide datasets have the potential to provide fine-scale population structure and genomic divergence estimates that can inform taxonomy. Despite the high resolution of our PE-ddRAD dataset, fineRADStructure analysis showed no structure between two species pairs, P. mauretanicus and P. yelkouan , and A. creatopus andA. carneipes (Figure 3). Furthermore, although we do not consider there to be a genetic cutoff for species‐level divergence, the genetic divergence between these recently diverged species were the lowest amongst any pair of species and overlapped with the genetic divergences observed between individuals of the same subspecies (Figure 4). P. mauretanicus and P. yelkouan were granted species status based on morphological, osteological and reciprocal monophyly using cytochrome b sequences (Heidrich, Amengual, & Wink, 1998; Sangster, Knox, Helbig, & Parkin, 2002). However, more recently, a lack of correspondence at the individual level was found between phenotypic characters, stable isotopes analyses, nuclear and mtDNA, and was attributed to admixture between the two species (Genovart, Juste, Contreras-Díaz, & Oro, 2012; Militão, Gómez-Díaz, Kaliontzopoulou, & González-Solís, 2014). A. creatopus andA. carneipes are widely considered as two different species in taxonomic checklists (Carboneras, & Bonan, 2019; Gill et al., 2020), but some authors have argued that they should be considered conspecific based on the lack of uniform differentiation in colour and size (Bourne, 1962) and on low mtDNA differentiation (Penhallurick & Wink, 2004). These species pairs differ in plumage colouration and body size, which are known to be labile traits even within species of shearwaters. Dark and pale phases can be found within a single species (i.e.,A. pacifica ) and some species exhibit a continuum from pale to dark (i.e., P. mauretanicus ). Body size covaries with migratory behaviour (see previous section), can be under selection (Barbraud, 2000; Navarro, Kaliontzopoulou, & González-Solís, 2009), and thus could evolve rapidly under strong selection pressures. In addition to the aforementioned species pairs, other shearwater species showed weak patterns of population structure and genetic distances within the interval among different subspecies: P. boydi and P. baroli , and the three Atlantic Calonectris species. These species complexes are the subject of ongoing taxonomic debate (Genovart et al., 2013; Gómez-Díaz et al., 2009; Olson, 2010; Sangster, Collinson, Helbig, Knox, & Parkin, 2005; Ramos et al., 2020). As a final consideration, our genomic data, together with ongoing taxonomic debate, suggest that these taxa should not be granted species status. Future studies should use species delimitation approaches combining genomic data with a thorough morphological revaluation including a detailed evaluation of vocalisations. Further research is also required to include the taxa that could not be sampled during this study, particularly taxa from the tropical Pacific that breed in remote islands and have very limited distributions and low population sizes.