4. Discussion
DNA metabarcoding has become an essential tool for species inventory and monitoring. However, its use in identifying ichthyoplankton is still incipient in the Neotropics, and several methodological challenges and biases remain to be addressed (Carvalho, 2023). Given the growing demand for innovative techniques to unravel the complex reproductive dynamics of fish communities, for both research and practical applications, this methodology needs continuous refinement. The COI gene has commonly been the marker of choice (Mariac et al., 2018; Nobile et al., 2019) because of its well-established primers and the complete reference sequence libraries encouraged by the global initiative Fish Barcode of Life. In the present study, each molecular marker recovered a distinct community structure in both the quantitative and the qualitative analyses.
Choosing a single marker still raises concerns, because using several markers with HTS remains expensive and each marker has its own amplification biases and taxonomic resolution (Deagle et al., 2014). The high interspecific variability of the COI gene compared with other mitochondrial genes (Hebert et al., 2003) can help differentiate closely related species. However, the same variability requires universal degenerate primers, which have lower specificity than those designed for more conserved genes, and it hinders the design of internal minibarcodes for COI (Deagle et al., 2014). In the present study, COI presented 93.33% sequencing success, and some of its sequences remained unassigned or were assigned to Bacteria, whereas both 12S markers were successfully sequenced for all samples and all their sequences were assigned to fish taxa. Additionally, the sequencing technology used limits the total fragment size to 600 bp, which precluded merging both COI strands into the full-sized barcode; each strand was therefore analyzed independently. The resulting loss of resolution could explain why COI detected fewer species, genera, and families than the 12S markers, and why two of the three exclusive species-level identifications were assigned to nonnative fishes closely related to species from São Francisco.
Minibarcode markers for the 12S gene have been developed and applied to environmental DNA metabarcoding studies (Milan et al., 2020; Miya et al., 2020; Sales et al., 2021) and, more recently, to ichthyoplankton studies as well (Jiang et al., 2022; Van Nynatten et al., 2023). One of the main concerns when using these markers is the conserved nature of the gene, which can limit their ability to differentiate closely related species, especially in diverse regions. However, the current study shows that both MiFish and NeoFish were able to identify and distinguish multiple congeneric species, such as Leporinus piau and L. taeniatus, Megaleporinus elongatus and M. reinhardti, Pimelodus fur, P. maculatus and P. pohli, and Prochilodus argenteus and P. costatus. Moreover, the 12S markers showed higher species detection sensitivity than COI, considering that the fishes they exclusively retrieved were underrepresented, with low RRA. This could result from low-efficiency binding of the COI primers, which can lead to a lack of amplification (Zhang et al., 2020).
Database completeness is another variable that directly impacts species detection, as a lack of reference sequences for a given species may hamper accurate taxonomic assignment (Collins et al., 2019). This aspect affected both COI and 12S markers in this study. For instance, Pachyurus squamipennis is not represented by any COI reference sequence in the public databases and was detected exclusively by 12S, whereas the only native species retrieved solely by COI, Bergiaria westermanni, has no representative 12S sequence in either the public databases or our custom library. These limitations highlight the importance of continuous sequencing efforts to broaden reference sequence databases, especially for megadiverse regions.
Considering that each marker has advantages and limitations, some studies suggest combining multiple primer sets to increase taxonomic coverage (Liu and Zhang, 2021; Zhang et al., 2020). In a metabarcoding study using multiplexed markers to identify zooplankton mock communities, Zhang et al. (2018) demonstrated that a multi-marker approach can improve species detection and allow the cross-validation of taxa detected by each marker. Our results support this conclusion, as combining the three markers increased genus detection by up to 87.5% and species detection by up to 61.54%. Therefore, employing multiple markers, whenever feasible, reduces the likelihood of overlooking species or misclassifying them because sequences are absent from or mislabeled in the reference database (Locatelli et al., 2020).
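As a minimal illustration of how a detection gain from pooling markers can be expressed, the sketch below takes the union of the taxa detected by each marker and reports the percent increase relative to each marker used alone. The taxon labels and detection sets are hypothetical placeholders, not data from this study, and the formula (pooled richness minus single-marker richness, divided by single-marker richness) is an assumption about how such percentages are typically derived.

```python
def detection_gain(detections):
    """Pool taxa across markers and compute the percent gain over each single marker.

    detections: dict mapping marker name -> set of detected taxa.
    Returns (pooled_taxa, gains), where gains maps marker -> percent increase.
    """
    pooled = set().union(*detections.values())
    gains = {}
    for marker, taxa in detections.items():
        if not taxa:
            raise ValueError(f"marker {marker!r} detected no taxa")
        # Percent increase in richness when moving from one marker to all markers
        gains[marker] = 100.0 * (len(pooled) - len(taxa)) / len(taxa)
    return pooled, gains


# Hypothetical detections (placeholder taxon labels, not study results)
detections = {
    "COI": {"taxon_A", "taxon_B", "taxon_C", "taxon_D"},
    "MiFish": {"taxon_A", "taxon_B", "taxon_C", "taxon_E", "taxon_F"},
    "NeoFish": {"taxon_A", "taxon_B", "taxon_E", "taxon_F", "taxon_G"},
}

pooled, gains = detection_gain(detections)
for marker, gain in gains.items():
    print(f"{marker}: {len(detections[marker])} -> {len(pooled)} taxa (+{gain:.2f}%)")
```

The largest value in `gains` corresponds to the "up to" figure: it is the gain relative to the marker that detected the fewest taxa on its own.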
Discrepancies between markers were observed in the quantitative analysis based on RRA estimates. Although some studies with mock samples of eggs (Duke and Burton, 2020) and larvae (Nobile et al., 2019) found a positive correlation between input organisms and output reads for each species, the results of this study support the idea that amplification bias is one of the main pitfalls of quantitative metabarcoding analyses, as already reported (Carvalho, 2023; Fonseca, 2018). While MiFish and COI showed similar RRA for samples in which both detected similar taxa, low-efficiency primer binding to Siluriformes sequences, and especially to Pimelodidae sequences, resulted in completely different abundance patterns for NeoFish.
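For readers less familiar with the RRA estimate, the sketch below shows one way to compute it for a single sample (reads assigned to a taxon divided by all assigned reads) and to quantify how much two markers disagree. The read counts are hypothetical, and the Bray-Curtis dissimilarity is used here only as an illustrative comparison metric; it is not necessarily the measure applied in this study.

```python
def relative_read_abundance(read_counts):
    """Convert per-taxon read counts for one sample into proportions (RRA)."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("sample has no assigned reads")
    return {taxon: count / total for taxon, count in read_counts.items()}


def bray_curtis(rra_a, rra_b):
    """Bray-Curtis dissimilarity between two RRA profiles (0 = identical, 1 = disjoint)."""
    taxa = set(rra_a) | set(rra_b)
    # Taxa missing from one profile are treated as having zero abundance there
    numerator = sum(abs(rra_a.get(t, 0.0) - rra_b.get(t, 0.0)) for t in taxa)
    denominator = sum(rra_a.get(t, 0.0) + rra_b.get(t, 0.0) for t in taxa)
    return numerator / denominator


# Hypothetical read counts for the same sample amplified with two markers
marker_x_reads = {"taxon_A": 600, "taxon_B": 300, "taxon_C": 100}
marker_y_reads = {"taxon_A": 100, "taxon_B": 300, "taxon_C": 600}

rra_x = relative_read_abundance(marker_x_reads)
rra_y = relative_read_abundance(marker_y_reads)
dissimilarity = bray_curtis(rra_x, rra_y)
print(f"Bray-Curtis dissimilarity between markers: {dissimilarity:.2f}")
```

In this toy case both markers detect the same three taxa, yet the abundance profiles differ substantially, which is the kind of pattern that marker-specific amplification bias can produce.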
In conclusion, using multiple markers from two distinct genes and of different lengths increased taxonomic coverage and allowed robust taxonomic classification of complex Neotropical ichthyoplankton communities. Caution is still needed when inferring species abundance from DNA metabarcoding data obtained with PCR-dependent protocols, since the estimates are marker dependent. Nonetheless, ichthyoplankton metabarcoding offers superior resolution and feasible scalability compared to traditional techniques and provides qualitative information that is paramount for characterizing reproducing species and defining conservation strategies.