Protein coding low-copy rpb2 and ef1-α regions are viable fungal
metabarcoding DNA markers which can supplement ITS for better accuracy
Abstract
The nuclear ribosomal DNA Internal Transcribed Spacer (ITS) region is
used as the universal fungal barcode marker, but it often lacks a
significant DNA barcoding gap between sister taxa. Here we tested the
reliability of protein-coding low-copy genes as alternative markers.
Mock communities of three unrelated agaric genera (Dermoloma, Hodophilus
and Russula), representing lineages of closely related species, were
sequenced on the Illumina platform, targeting the ITS1 and ITS2 regions,
the gene for the second largest subunit of RNA polymerase II (rpb2) and
the translation elongation factor 1-alpha gene (ef1-α). The
representation of species and their relative abundances were similar in
all tested barcode regions, despite the lower copy number of the
protein-coding markers. ITS1 and
ITS2 required more sophisticated sequence filtering because they
produced a high number of chimeric sequences, which required
reference-based chimera removal, and had a higher number of sequence
variants per species.
Clustering of filtered ITS sequences yielded, on average, a higher
number of correctly clustered units at the best-fitting similarity
thresholds, but these thresholds differed considerably among genera.
Best-fitting
thresholds of low-copy markers were more consistent among genera, but
species resolution was frequently missing due to low intraspecific
variability. At some thresholds we observed multiple species lumped
together and, at the same time, species split into multiple partial
clusters, which should be taken into consideration when assessing the
best clustering thresholds and the taxonomic identity of clusters. For best
taxonomic resolution and better species detection, we recommend
combining different markers and applying additional reference-based
sorting of clusters.
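
The distinction between lumped and split clusters can be illustrated with a minimal Python sketch. It is not the pipeline used in this study; it assumes that every sequence of a mock community carries a known species label and a cluster identifier, and the function name, labels and example data are hypothetical. A cluster is treated as lumped when it contains more than one species and as split when at least one of its species also occurs in another cluster.

```python
from collections import defaultdict


def classify_clusters(assignments):
    """Label each cluster as correct, lumped, split, or lumped+split.

    assignments: iterable of (cluster_id, species) pairs, one per sequence.
    The input layout is an assumption made for this illustration.
    """
    species_in_cluster = defaultdict(set)
    clusters_of_species = defaultdict(set)
    for cluster_id, species in assignments:
        species_in_cluster[cluster_id].add(species)
        clusters_of_species[species].add(cluster_id)

    labels = {}
    for cluster_id, species_set in species_in_cluster.items():
        # Lumped: several species share one cluster.
        lumped = len(species_set) > 1
        # Split: a species of this cluster also appears in another cluster.
        split = any(len(clusters_of_species[s]) > 1 for s in species_set)
        if lumped and split:
            labels[cluster_id] = "lumped+split"
        elif lumped:
            labels[cluster_id] = "lumped"
        elif split:
            labels[cluster_id] = "split"
        else:
            labels[cluster_id] = "correct"
    return labels


# Hypothetical mock-community result at a single similarity threshold.
example = [
    ("c1", "species_A"), ("c1", "species_A"),
    ("c2", "species_B"), ("c2", "species_C"),
    ("c3", "species_D"), ("c4", "species_D"),
    ("c4", "species_E"),
]
labels = classify_clusters(example)
print(labels)
```

Counting the "correct" labels for each tested threshold gives one possible way to compare thresholds, while the "lumped+split" label marks the case in which both problems occur at the same threshold.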