2.4 | Statistical Methods
Sequence fragments were assembled, edited, and aligned with MUSCLE (Edgar 2004) within Geneious Prime. The COI alignment was translated with the Mold, Protozoan, and Coelenterate mitochondrial translation table to confirm that the sequences were aligned in the correct reading frame and that no stop codons were present.
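The reading-frame check described above can be illustrated with a minimal sketch. It assumes NCBI translation table 4, in which TGA encodes tryptophan and only TAA and TAG terminate translation. The function name `find_stop_codons` and the toy sequences are hypothetical and are not part of the Geneious workflow used here.

```python
# Stop codons under NCBI translation table 4 (Mold, Protozoan, and
# Coelenterate mitochondrial code): TGA encodes tryptophan, so only
# TAA and TAG terminate translation.
TABLE4_STOPS = {"TAA", "TAG"}


def find_stop_codons(seq, frame=0):
    """Return 0-based nucleotide positions of in-frame stop codons.

    `seq` is a nucleotide sequence; `frame` (0, 1, or 2) is the offset
    of the first complete codon. Codons containing gaps or ambiguity
    codes never match the stop set, so they are passed over.
    """
    seq = seq.upper()
    hits = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in TABLE4_STOPS:
            hits.append(i)
    return hits


# Hypothetical toy sequences, not data from this study.
print(find_stop_codons("ATGTGAGGATTT"))  # TGA is read as Trp under table 4
print(find_stop_codons("ATGTAAGGATTT"))  # TAA is a stop in the second codon
```

An empty result in the chosen frame is consistent with a correctly framed, uninterrupted coding alignment; any hit would flag a sequence for inspection.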
We used jModelTest (Posada & Crandall 1998) within Geneious to select the best-fitting evolutionary model. For all available ctenophores, we estimated phylogenies with MrBayes (v3.2, Ronquist et al. 2012) and with IQ-TREE 2 (Minh et al. 2020) using 1000 bootstrap replicates, in each case under the most appropriate model selected by the AIC (Akaike 1974). Bayesian phylogenies estimated with MrBayes included multiple runs of 5 × 10^6 generations with six chains and a 10% burn-in, sampled and printed every 1000 generations. Convergence was assessed with Tracer v1.7 (Rambaut et al. 2018) and by comparing the topologies of multiple runs.
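As an illustration of AIC-based model selection, the sketch below computes AIC = 2k - 2 ln L for a set of candidate models and returns the one with the lowest score. The log-likelihoods and parameter counts are hypothetical placeholders, not results from this study, and the helper names are assumptions; jModelTest performs the likelihood calculations and parameter counting itself.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model_by_aic(fits):
    """Pick the model with the lowest AIC.

    `fits` maps a model name to a (log_likelihood, n_params) tuple.
    Returns the best model name and a dict of all AIC scores.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical values for illustration only. Real parameter counts
# would also include branch lengths and any rate-heterogeneity terms.
candidate_fits = {
    "JC": (-5200.0, 0),
    "HKY": (-5100.0, 4),
    "GTR+G": (-5050.0, 9),
}

best, scores = best_model_by_aic(candidate_fits)
print(best, scores[best])
```

The penalty term 2k means that a more parameter-rich model is preferred only when its gain in log-likelihood outweighs the added parameters.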
Phylogenies were visualized with FigTree (v1.4.4, tree.bio.ed.ac.uk/software/figtree/). To illuminate within-order diversity, alignments of lobate species for both COI and 18S fragments were analyzed separately with the same parameters that had been used for the full alignment, and both Bayesian and likelihood support values were reported on the phylogeny. The lobate phylogenies were rooted with the cydippid Pleurobrachia bachei as the outgroup.
To assess COI diversity within the sequenced ctenophores, we calculated percent general time reversible (GTR) distance within and between molecular operational taxonomic units (MOTUs). Base composition was calculated with MEGA X (v10.0.5, Kumar et al. 2018), and pairwise distances were calculated in Geneious Prime. We calculated the number of parsimony-informative sites within the Lobata for both COI and 18S with DnaSP (v6.12.01, Rozas et al. 2017). We tested for saturation by comparing the observed proportions of transitions and transversions against GTR distance among all MOTUs with DAMBE7 (v7, Xia 2018). We also used DAMBE7 to test for saturation in five other mitochondrial loci taken from the published mitochondrial genomes of eight ctenophore species.
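For readers unfamiliar with the term, a site is parsimony-informative when at least two different nucleotides each occur in at least two sequences. The sketch below counts such sites in a toy alignment. It is an assumption-laden illustration: gaps and ambiguity codes are simply ignored at each column, which may differ from how DnaSP treats missing data, and the function name and sequences are hypothetical.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_parsimony_informative(alignment):
    """Count parsimony-informative columns in a list of aligned sequences.

    A column counts if at least two distinct bases each appear in at
    least two sequences. Gaps and ambiguity codes are ignored
    (an assumption of this sketch).
    """
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all sequences must have the same aligned length")
    informative = 0
    for column in zip(*alignment):
        counts = Counter(b for b in (c.upper() for c in column) if b in VALID_BASES)
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
    return informative


# Hypothetical toy alignment, not data from this study.
toy_alignment = [
    "ACGTA",
    "ACGTA",
    "ATGCA",
    "ATGCG",
]
print(count_parsimony_informative(toy_alignment))  # columns 2 and 4 qualify
```

In the toy alignment the last column is variable but not informative, because the variant G occurs in only one sequence.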
To illustrate how a more complete reference library can affect metabarcoding sequence assignments, we used the ctenophore COI fragment data from Pitz et al. (2020) and queried our ctenophore-specific library with the same methods as the authors of that study. Stations where no ctenophore sequences were recovered were not reported. We plotted species assignments based on the presence or absence of reads in each sample in RStudio (RStudio Team 2015) with ggplot2 (Wickham 2016).
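The conversion from read counts to presence or absence can be sketched as follows. This is a minimal illustration in Python rather than the R code used for plotting; the input structure, the function name `presence_absence`, and the station and taxon labels are all hypothetical.

```python
def presence_absence(read_counts):
    """Reduce per-sample read counts to lists of taxa that are present.

    `read_counts` maps a sample name to a dict of taxon -> read count.
    A taxon is present if it has at least one read. Samples with no
    reads for any taxon are dropped, mirroring the omission of stations
    without ctenophore sequences.
    """
    table = {}
    for sample, taxa in read_counts.items():
        present = sorted(taxon for taxon, n in taxa.items() if n > 0)
        if present:
            table[sample] = present
    return table


# Hypothetical counts for illustration only.
counts = {
    "station_1": {"taxon_A": 12, "taxon_B": 0},
    "station_2": {"taxon_A": 0, "taxon_B": 0},
    "station_3": {"taxon_B": 3, "taxon_A": 1},
}
print(presence_absence(counts))
```

Using presence or absence rather than raw read counts avoids treating read abundance as a proxy for organism abundance.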