Keywords: Molecular taxonomy, COI, 12S rRNA, High-throughput DNA sequencing

1. Introduction
The study of ichthyoplankton composition, abundance and distribution is pivotal for understanding the reproductive dynamics of local fish assemblages (Mariac et al., 2018). The analysis of these parameters allows the identification of spawning sites, nursery areas where recruitment occurs, migration routes, temporal and spatial pattern variations and differences in the reproduction patterns of migratory and nonmigratory fish (Baumgartner et al., 2004; Bialetzki et al., 2005; Reynalte-Tataje et al., 2012). This information is instrumental in elucidating the influence of anthropogenic environmental alterations on fish reproduction and in the definition of effective management actions for species conservation and, consequently, fishing stock maintenance (da Silva et al., 2015; Silva et al., 2017).
Traditionally, ichthyoplankton taxonomy has relied on the regressive development sequence technique, which is based on the morphological comparison of younger larvae with previously identified juveniles (Ahlstrom and Moser, 1976; Nakatani et al., 2001). However, because eggs lack diagnostic morphological characters, some authors exclude them from their studies and identify only larvae, which is itself difficult in the initial developmental stages (Baumgartner et al., 2008; Reynalte-Tataje et al., 2012). Moreover, the accuracy of traditional morphological identification can vary between taxonomists and laboratories, according to their experience and specialty (Ko et al., 2013). These limitations can compromise the collection of information essential to conserving the areas of interest (Nobile et al., 2019).
Studies have employed molecular techniques to strengthen the precision and reliability of ichthyoplankton taxonomy. Comparative investigations have demonstrated that molecular taxonomy using DNA barcoding is more efficient than traditional morphological taxonomy, identifying eggs and larvae to lower taxonomic levels and correcting erroneous morphological identifications (Becker et al., 2015; Ko et al., 2013). Using DNA barcoding, Frantine-Silva et al. (2015) identified over 99% of 536 ichthyoplankton samples to species level, including eggs, which accounted for 30% of the observed species richness. Becker et al. (2015) could morphologically identify eggs only as migratory or nonmigratory, when possible, whereas DNA barcoding allowed the identification of eggs (plus damaged larvae) to species level and revealed imprecisions in the morphological taxonomy even at that broader level. Nonetheless, despite its great taxonomic precision, DNA barcoding relies on the individual processing and sequencing of each organism, and can become expensive and laborious for large-scale inventories (Taberlet et al., 2012; Yu et al., 2012), such as ichthyoplankton studies (Mariac et al., 2018; Nobile et al., 2019).
The DNA metabarcoding approach, which uses High-Throughput Sequencing (HTS), has gained prominence for its ability to provide large-scale access to biodiversity data and to transform ecology (Yu et al., 2012). The method combines DNA barcode-based taxonomy with HTS to simultaneously identify hundreds to thousands of organisms. DNA metabarcoding analyses are economical, quick, broad, and minimally dependent on taxonomic expertise, and their data remain available for further verification (Taberlet et al., 2012; Yu et al., 2012). This approach has allowed the reconstruction of ancestral communities (Jørgensen et al., 2012), biodiversity monitoring (Andersen et al., 2012), and the detection of a larger number of operational taxonomic units in a fraction of the time spent in conventional studies based on morphology and DNA barcoding (Fonseca et al., 2010). It has also shown high efficiency in ecological ichthyoplankton studies, allowing precise and reliable identification of bulk samples of fish eggs and larvae (Kimmerling et al., 2018; Mariac et al., 2018).
Unlike environmental samples (for example, soil and water), in which genetic material is often degraded, bulk samples usually provide genomic DNA of better quality, allowing the amplification of longer markers (Taberlet et al., 2012). However, the HTS platforms accessible to most research laboratories are limited to sequencing lengths of up to 600 base pairs (bp). This hampers the use of markers previously standardized for DNA barcoding, such as the 650 bp fragment of the mitochondrial cytochrome c oxidase subunit I (COI) gene commonly used for fish (Ward, 2009). Additionally, the variability of COI sequences hinders the design of internal minibarcode primers, leading some researchers to pass over this gene in favor of more conserved ones for metabarcoding (Deagle et al., 2014). Among these conserved genes, mitochondrial 12S rRNA has been highlighted as a good alternative for fish metabarcoding (Milan et al., 2020; Miya et al., 2020; Sales et al., 2021).
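The read-length constraint can be illustrated with a short calculation. The sketch below is not part of the study's workflow; it assumes that the 600 bp limit is reached with paired-end reads (two reads of 300 bp each) that must overlap by a minimum number of bases to be merged. The function name, the 20 bp minimum overlap, and the 170 bp example marker length are hypothetical values chosen only for illustration.

```python
def fits_read_length(amplicon_bp: int, read_bp: int = 300, min_overlap_bp: int = 20) -> bool:
    """Return True if a paired-end read pair can cover the amplicon and still overlap.

    Assumption (not from the source): two reads of `read_bp` each are merged,
    and the merge needs at least `min_overlap_bp` shared bases.
    """
    max_mergeable_bp = 2 * read_bp - min_overlap_bp
    return amplicon_bp <= max_mergeable_bp


# The 650 bp COI barcode exceeds what a 2 x 300 bp read pair can span.
coi_fits = fits_read_length(650)

# A shorter marker (hypothetical 170 bp fragment) fits comfortably.
short_marker_fits = fits_read_length(170)

print(coi_fits, short_marker_fits)
```

Under these assumptions, any amplicon longer than 580 bp cannot be merged, which is why shorter internal fragments or alternative markers are needed.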
Besides marker selection, another challenge in DNA metabarcoding is quantitative analysis. Several factors can bias the number of reads obtained for each individual or species, such as the number of mitochondria per cell, differences in body size among individuals in the same sample, and amplification bias (Carvalho, 2022; Fonseca, 2018). Nonetheless, some studies have shown a positive correlation between the number of eggs or larvae in mock samples and the number of reads obtained for each taxon using DNA metabarcoding with an amplification step (Duke and Burton, 2021; Nobile et al., 2019).
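As an illustration of how such a relationship might be checked, the sketch below computes a Pearson correlation coefficient between the number of individuals per taxon in a mock sample and the number of reads recovered for each taxon. The counts are invented for demonstration and do not come from any of the cited studies; the choice of Pearson's coefficient is also an assumption, since the cited works may have used other statistics.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical mock sample: individuals added per taxon and reads obtained.
larvae_per_taxon = [1, 2, 5, 10, 20]
reads_per_taxon = [180, 450, 900, 2300, 3900]

r = pearson_r(larvae_per_taxon, reads_per_taxon)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 would indicate that read counts track abundance in the mock sample, although it would not by itself rule out the biases listed above.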
This study used DNA metabarcoding to analyze the composition of ichthyoplankton sampled in the megadiverse Neotropical São Francisco River Basin, in Brazil. Additionally, the sensitivity, specificity, and taxonomic resolution of two 12S markers were tested and compared with those of the traditional COI fragment used for DNA barcoding. The results obtained here will contribute to an improved method for ecological studies focusing on the reproductive dynamics of the ichthyofauna, and to the design of management and conservation strategies for the maintenance of fish reproduction locally.
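One common way to quantify the performance of a marker against a sample of known composition is sketched below. It is an assumption that the comparison follows these textbook definitions; the study's own criteria may differ. The taxon labels and the helper function name are hypothetical.

```python
def sensitivity_specificity(expected, detected, candidates):
    """Compute sensitivity and specificity of a marker from sets of taxa.

    expected:   taxa truly present in the sample
    detected:   taxa reported by the marker
    candidates: all taxa that could have been reported (reference list)
    """
    true_pos = len(expected & detected)
    false_neg = len(expected - detected)
    false_pos = len(detected - expected)
    true_neg = len(candidates - expected - detected)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical example with eight candidate taxa labelled A to H.
candidates = set("ABCDEFGH")
expected = {"A", "B", "C", "D"}
detected = {"A", "B", "C", "E"}

sens, spec = sensitivity_specificity(expected, detected, candidates)
print(sens, spec)
```

In this invented example, the marker recovers three of the four taxa present and reports one taxon that is absent, so both measures equal 0.75.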