4.2 Sequencing artifacts due to mtDNA duplication and
uncertainties about molecular clock rates
The mitochondrial genome of several Procellariiformes presents tandem
repeats (Abbott et al. 2005; Lounsberry et al. 2015), including P.
lherminieri (Torres et al. 2018). In this taxon, a duplicated region
comprising two copies of the CR was found, yielding the presence of
double peaks on chromatograms. By treating gDNA with an exonuclease
(which effectively removed all linear DNA), we did not observe triple-
or quadruple-peaks on chromatograms, and thus hypothesized that only two
copies of the CR sequences were amplified by PCR. Therefore we
considered two haplotypic phases for all CR sequences, assuming that
removing all CR sequences would have led to a loss of information and of
statistical power in the analyses. To check the robustness of this
approach however, we replicated all our analyses using two other data
subsets: one in which all ambiguous CR sequences were removed, and
another one from which all CR sequences were removed. Removing all CR
sequences led to a strong loss of information, an increase of the
estimated differentiation and a decrease of the estimated divergence
times (Supplementary Material 6). Removing only the ambiguous sequences
led to estimations close to the estimations of the complete dataset.
This suggests that the noise caused by the multiple copies of CR was
swamped by the signal contained in that marker and so our analyses using
all the individuals are not significantly biased by these sequences.
Another major issue concerns the choice of a molecular clock rate for
dating the splitting events. We found a clear hierarchical structuration
of populations in the Atlantic and Indo-Pacific Oceans, which diverged
around 1.76 to 2.71 My ago either using all markers or only mtDNA
markers, respectively. In contrast, using the same taxa (but fewer
specimens and only cytb ), Austin et al. (2004) dated this split
at 3.2-3.8 My, and rather suggested that the closure of the Isthmus of
Panama erected a barrier to gene flow between Indo-Pacific and Atlantic
populations. This difference in divergence times might be due to taxon
sampling, gene sampling, or the calibration of the molecular clock
(0.631%/My here, vs. 0.9%/My in Austin et al). This latter parameter
is probably a major contributor to the difference observed between our
studies, since using the same value as Austin et al, and only sequences
from cytb , we found a divergence time of 2.2-3.8 My, thus very
close to Austin et al. Many substitution rates for both Mt DNA markers
were proposed for petrels, ranging from 0.19%/My (Pacheco et al. 2011)
to 1.544%/My (Pereira & Baker 2006) for cox1 , and from
0.18%/My (Pacheco et al. 2011) to 0.88-0.92%/My (Nunn & Stanley 1998;
the latter rate was used by Austin et al), 1.022%/My (Pereira & Baker
2006) and 1.89%/My (Weir & Schluter 2008) for cytb . Using
lowest rates (Pacheco et al. 2011) resulted in a divergence time between
Atlantic and Indo-Pacific lineages around 12.8 My ago (Supplementary
Material 10), an unlikely value given the presence of an ocean between
the Americas before the Isthmus of Panama erected, despite the fact thatPuffinus is dating back to the Oligocene (Henderson & Gill
2010). Here we used the substitution rates that were estimated in
Nabholz et al. 2016, as the most recent review. There are two further
arguments against Austin et al. (2004) scenario: first, the closure of
the Isthmus of Panama has been actually dated at 2.8 My ago (O’Dea et
al. 2016), thus later than Austin et al’s scenario. Second, we found
that the pacific taxon, dichrous , was not ancestral to the
Atlantic taxon, but was embedded within Indian ocean taxa. The use of
precise fossil calibration would bring a more robust estimation of
divergence times and so a supplemental evidence of the origin of the
colonization.