Vasilii Shapkin

and 7 more

The nuclear ribosomal DNA Internal Transcribed Spacer (ITS) region is used as the universal fungal barcode marker, but it often lacks a significant DNA barcoding gap between sister taxa. Here we tested the reliability of protein-coding low-copy genes as alternative markers. Mock communities of three unrelated agaric genera (Dermoloma, Hodophilus and Russula), representing lineages of closely related species, were sequenced on the Illumina platform, targeting the ITS1 and ITS2 regions, the gene for the second largest subunit of RNA polymerase II (rpb2) and the translation elongation factor 1-alpha gene (ef1-α). The representation of species and their relative abundances were similar across all tested barcode regions, despite the lower copy number of the protein-coding markers. ITS1 and ITS2 required more sophisticated sequence filtering because they produced a high number of chimeric sequences, which required reference-based chimera removal, and had a higher number of sequence variants per species. Clustering of filtered ITS sequences yielded, on average, a higher number of correctly clustered units at the best-fitting similarity thresholds, but these thresholds differed strongly among genera. The best-fitting thresholds of the low-copy markers were more consistent among genera, but species resolution was frequently missing because of low interspecific variability. At some thresholds we observed multiple species lumped into one cluster and, at the same time, single species split into multiple partial clusters; this should be taken into account when assessing the best clustering thresholds and the taxonomic identity of clusters. For the best taxonomic resolution and better species detection, we recommend combining different markers and applying additional reference-based sorting of clusters.
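The distinction between correctly clustered units, lumped species and split species can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `assess_clusters`, the species and cluster labels, and the rule that a cluster counts as "correct" only when it holds exactly one species that occurs in no other cluster are assumptions made for illustration.

```python
from collections import defaultdict


def assess_clusters(assignments):
    """Classify clustering output for a mock community of known composition.

    assignments: iterable of (species, cluster_id) pairs, one per sequence.
    Returns sets of correct cluster ids, lumped cluster ids and split species.
    The definition of "correct" used here is an assumption, not the paper's.
    """
    species_in_cluster = defaultdict(set)
    clusters_of_species = defaultdict(set)
    for species, cluster in assignments:
        species_in_cluster[cluster].add(species)
        clusters_of_species[species].add(cluster)

    # A lumped cluster contains sequences of more than one species.
    lumped = {c for c, sp in species_in_cluster.items() if len(sp) > 1}
    # A split species has its sequences spread over more than one cluster.
    split = {s for s, cl in clusters_of_species.items() if len(cl) > 1}
    # A correct cluster holds a single species that is not split elsewhere.
    correct = {
        c
        for c, sp in species_in_cluster.items()
        if len(sp) == 1 and next(iter(sp)) not in split
    }
    return {"correct": correct, "lumped": lumped, "split": split}


# Hypothetical clustering result at one similarity threshold.
example = [
    ("sp_A", "c1"), ("sp_A", "c1"),
    ("sp_B", "c2"), ("sp_C", "c2"),  # sp_B and sp_C lumped in c2
    ("sp_C", "c3"),                  # sp_C also split across c2 and c3
    ("sp_D", "c4"),
]
result = assess_clusters(example)
```

In this example cluster `c2` is lumped while species `sp_C` is simultaneously split, the situation described above; running the function over a range of thresholds and counting the correct clusters at each is one way to compare candidate thresholds.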