Dona Kireta

and 6 more

Efforts to explore optimal molecular methods for identifying plant mixtures, particularly pollen, are increasing. Pollen identification (ID) and quantification are important in many fields, including pollination ecology and agricultural sciences, but quantifying mixture proportions remains challenging. Traditional pollen ID using microscopy is time-consuming, requires expertise, and has limited accuracy and throughput. Molecular barcoding approaches now being explored offer improved accuracy and throughput. The common approach, amplicon sequencing, uses PCR amplification to isolate DNA barcodes, but this introduces significant bias that impairs downstream quantification. We apply a novel molecular hybridisation capture approach to artificial pollen mixtures to improve upon current taxon ID and quantification methods. The method randomly fragments DNA and uses RNA baits to capture DNA barcodes, which allows PCR duplicates to be removed and so reduces downstream quantification bias. Metabarcoding was tested using two reference libraries constructed from publicly available sequences: the matK plastid barcode and RefSeq complete chloroplast references. Single barcode-based taxon ID did not consistently resolve to species or genus level. The RefSeq chloroplast database performed better qualitatively but had limited taxon coverage (relative to the species used here) and introduced ID issues. At family level, both databases yielded comparable qualitative results, but the RefSeq database performed better quantitatively. A restricted matK database containing only the mixture species yielded sequence proportions highly correlated with input pollen proportions, demonstrating the usefulness of hybridisation capture for metabarcoding and quantifying pollen mixtures. The choice of reference database remains one of the most important factors affecting qualitative and quantitative accuracy.
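
The quantification idea in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how duplicates were identified or which correlation statistic was used. The sketch assumes that PCR duplicates are reads sharing the same reference, start, end and strand (plausible only because random fragmentation makes independent fragments unlikely to share both endpoints), and it uses Pearson correlation as one possible measure. The record fields (`ref`, `start`, `end`, `strand`, `taxon`) and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def dedup_reads(reads):
    """Keep the first read for each (ref, start, end, strand) signature.

    Assumption: reads with identical mapping coordinates are PCR copies
    of one original fragment.
    """
    seen = set()
    unique = []
    for read in reads:
        key = (read["ref"], read["start"], read["end"], read["strand"])
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


def taxon_proportions(reads):
    """Return the fraction of reads assigned to each taxon."""
    counts = Counter(read["taxon"] for read in reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlate_with_input(observed, expected):
    """Correlate observed read proportions with known input proportions.

    Taxa present in the input mixture but absent from the reads count as 0.
    """
    taxa = sorted(expected)
    return pearson([observed.get(t, 0.0) for t in taxa],
                   [expected[t] for t in taxa])
```

In use, the mapped reads would be passed through `dedup_reads`, then `taxon_proportions`, and the result compared against the known pollen proportions of the artificial mixture with `correlate_with_input`.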

Nicole Foster

and 7 more

Metabarcoding has improved the way we understand plants within our environment, from their ecology and conservation to invasive species management. Identifying plant taxa within environmental samples relies on the ability to match unknown sequences to known reference libraries. Without comprehensive reference databases, species can go undetected or be incorrectly assigned, leading to false positive and false negative detections. To improve our ability to generate reference sequence databases, we developed a targeted capture approach using the OZBaits_CP V1.0 set, designed to capture chloroplast gene regions across the entirety of flowering plant diversity. We focused on generating a reference database for coastal temperate plant species, given the lack of reference sequences for these taxa. Our approach was successful across all specimens, with a target gene recovery rate of 92% achieved in a single assay (i.e., samples were pooled), making this approach much faster and more efficient than standard barcoding. Further testing of this database showed that 80% of all samples could be discriminated to family level across all gene regions, with some genes achieving greater resolution than others, depending also on the taxon of interest. Thus, we demonstrate the importance of generating reference sequences across multiple chloroplast gene regions, as no single locus is sufficient to discriminate across all plant groups. The targeted capture approach outlined in this study provides a way forward to achieve this.

Nicole Foster

and 7 more