Phylogenetic relationships
OrthoFinder identified 6,172 single-copy (1:1) orthologous genes across the 12 genomes surveyed. From these data we generated three supermatrices: 1) a CDS supermatrix 10,534,506 bp long, from which the 4D sites were extracted; 2) a 4D supermatrix with 1,512,677 4-fold degenerate sites; and 3) an amino acid supermatrix with 3,466,564 sites. Phylogenetic analyses of the 4D and amino acid supermatrices recovered the same topology with full support at all nodes (ultrafast bootstrap = 100; Figures S2 and S3).
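To illustrate how 4D sites can be pulled out of a CDS alignment, the following minimal Python sketch keeps the third codon position of every aligned codon that falls in a 4-fold degenerate family of the standard genetic code. It is an illustration only, not the pipeline used here: the function name `fourfold_sites` and the input format (a dictionary of in-frame, equal-length aligned CDS strings) are hypothetical, and the sketch assumes a strict criterion in which all species must share the same first two codon bases for a site to be retained.

```python
# Codon prefixes (first two bases) whose third position is 4-fold degenerate
# under the standard genetic code: Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly.
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_sites(aligned_cds):
    """Extract 4-fold degenerate sites from an in-frame codon alignment.

    aligned_cds: dict mapping species name -> aligned CDS string
    (hypothetical input format; all strings must have the same length).
    Returns a dict mapping species name -> string of retained third positions.
    """
    names = list(aligned_cds)
    length = len(aligned_cds[names[0]])
    kept = {name: [] for name in names}
    # Walk the alignment codon by codon, ignoring any trailing partial codon.
    for start in range(0, length - length % 3, 3):
        codons = [aligned_cds[n][start:start + 3].upper() for n in names]
        prefixes = {c[:2] for c in codons}
        # Assumed strict criterion: one shared prefix, in a 4-fold family,
        # and an unambiguous base at the third position in every species.
        if (
            len(prefixes) == 1
            and prefixes <= FOURFOLD_PREFIXES
            and all(c[2] in "ACGT" for c in codons)
        ):
            for n, c in zip(names, codons):
                kept[n].append(c[2])
    return {n: "".join(sites) for n, sites in kept.items()}


if __name__ == "__main__":
    toy = {
        "sp1": "GCTATGCCAGGA",
        "sp2": "GCCATGCCGGGG",
        "sp3": "GCAATGCTGGGT",
    }
    print(fourfold_sites(toy))
```

In the toy alignment, the first and fourth codons (Ala and Gly) contribute one site each, the Met codon is not 4-fold degenerate, and the third codon is skipped because the species differ at the second codon position.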
The ASTRAL analyses, which explicitly account for incomplete lineage sorting (ILS), used gene trees from either the individual gene sequences (CDS gene trees) or the individual amino acid sequences (amino acid gene trees). Both produced species trees with the same topology as those obtained by ML from the 4D or amino acid supermatrices (Figures S4 and S5). The normalized quartet score (the proportion of quartet trees in the input gene trees that agree with the species tree) was 0.78 for CDS gene trees and 0.64 for amino acid gene trees.
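The normalized quartet score can be made concrete with a small brute-force sketch in Python. This is not how ASTRAL computes the score internally; it simply enumerates every set of four taxa, which is feasible only for toy trees. Trees are written as nested tuples of taxon names, and the function names (`clades`, `quartet_topologies`, `normalized_quartet_score`) are hypothetical. The sketch assumes fully resolved (binary) trees.

```python
from itertools import combinations


def clades(tree):
    """Return (leaf set of tree, list of leaf sets of all internal nodes)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def quartet_topologies(tree):
    """Map each 4-taxon set to the unrooted split (pair | pair) the tree induces."""
    leaves, found = clades(tree)
    topologies = {}
    for quartet in combinations(sorted(leaves), 4):
        q = frozenset(quartet)
        for clade in found:
            pair = clade & q
            # A clade holding exactly two of the four taxa defines the split.
            if len(pair) == 2:
                topologies[q] = frozenset([pair, q - pair])
                break
    return topologies


def normalized_quartet_score(species_tree, gene_trees):
    """Fraction of gene-tree quartets that match the species-tree quartets."""
    species = quartet_topologies(species_tree)
    agree = total = 0
    for gene_tree in gene_trees:
        for q, split in quartet_topologies(gene_tree).items():
            if q in species:
                total += 1
                agree += split == species[q]
    return agree / total if total else float("nan")


if __name__ == "__main__":
    species_tree = ((("A", "B"), "C"), ("D", "E"))
    gene_trees = [
        ((("A", "B"), "C"), ("D", "E")),  # identical to the species tree
        ((("A", "C"), "B"), ("D", "E")),  # A and C swapped relative to B
    ]
    print(normalized_quartet_score(species_tree, gene_trees))  # 0.8
```

In this example the first gene tree agrees on all 5 quartets and the second on 3 of 5, giving 8/10 = 0.8. A score of 1 would mean that every gene tree quartet matches the species tree, and lower values indicate more gene tree discordance.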
The ultrametric tree (Figure 3) obtained using r8s from the 4D supermatrix ML tree summarizes the recovered topology. In this topology, the Atlantic yellow-nosed albatross (T. chlororhynchos, Diomedeidae) is the sister group to all the other Procellariiformes. We also find that storm petrels (Hydrobatidae and Oceanitidae) do not constitute a monophyletic group. In addition, diving petrels (Pelecanoides) are included within Procellariidae.