Protein coding low-copy rpb2 and ef1-α regions are viable fungal metabarcoding DNA markers which can supplement ITS for better accuracy
  • Vasilii Shapkin,
  • Miroslav Caboň,
  • Miroslav Kolarik,
  • Katarína Adamčíková,
  • Petr Baldrian,
  • Tereza Konvalinková,
  • Tomáš Větrovský,
  • Slavomir Adamcik
Vasilii Shapkin
Plant Science and Biodiversity Centre Slovak Academy of Sciences
Miroslav Caboň
Plant Science and Biodiversity Centre Slovak Academy of Sciences
Miroslav Kolarik
Institute of Microbiology Czech Academy of Sciences
Katarína Adamčíková
Slovak Academy of Sciences
Petr Baldrian
Institute of Microbiology of the ASCR
Tereza Konvalinková
Institute of Microbiology, Czech Academy of Sciences
Tomáš Větrovský
Institute of Microbiology of the ASCR
Slavomir Adamcik
Plant Science and Biodiversity Centre Slovak Academy of Sciences

Corresponding Author: [email protected]

Abstract

The nuclear ribosomal DNA Internal Transcribed Spacer (ITS) region is used as the universal fungal barcode marker, but it often lacks a significant DNA barcoding gap between sister taxa. Here we tested the reliability of protein-coding low-copy genes as alternative markers. Mock communities of three unrelated agaric genera (Dermoloma, Hodophilus and Russula), representing lineages of closely related species, were sequenced on the Illumina platform, targeting the ITS1 and ITS2 regions as well as regions of the gene for the second largest subunit of RNA polymerase II (rpb2) and of the translation elongation factor 1-alpha gene (ef1-α). The representation of species and their relative abundances were similar across all tested barcode regions, despite the lower copy number of the protein-coding markers. ITS1 and ITS2 required more sophisticated sequence filtering because they produced a high number of chimeric sequences, which required reference-based chimera removal, and had a higher number of sequence variants per species. Clustering of the filtered ITS sequences yielded, on average, a higher number of correctly clustered units at the best-fitting similarity thresholds, but these thresholds differed widely among genera. The best-fitting thresholds of the low-copy markers were more consistent among genera, but species resolution was frequently lacking because of low intraspecific variability. At some thresholds, multiple species were lumped together while, at the same time, species were split into multiple partial clusters; this should be taken into account when assessing the best clustering thresholds and the taxonomic identity of clusters. For the best taxonomic resolution and better species detection, we recommend combining different markers and applying additional reference-based sorting of clusters.
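
The lumping and splitting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `assess_clusters`, the cluster and species labels, and the working definition of a "correctly clustered unit" (a cluster holding exactly one species that occurs in no other cluster) are all assumptions made for illustration.

```python
from collections import defaultdict


def assess_clusters(assignments):
    """Summarise how clusters map onto known mock-community species.

    assignments: iterable of (cluster_id, species) pairs, one per sequence.
    Returns sets of correct clusters, lumped clusters and split species.
    """
    species_in_cluster = defaultdict(set)
    clusters_of_species = defaultdict(set)
    for cluster_id, species in assignments:
        species_in_cluster[cluster_id].add(species)
        clusters_of_species[species].add(cluster_id)

    # A cluster is "lumped" when it contains more than one species.
    lumped = {c for c, sp in species_in_cluster.items() if len(sp) > 1}
    # A species is "split" when its sequences fall into several clusters.
    split = {s for s, cl in clusters_of_species.items() if len(cl) > 1}
    # Assumed definition: a correct cluster holds a single species
    # that is found in no other cluster.
    correct = {
        c for c, sp in species_in_cluster.items()
        if len(sp) == 1 and len(clusters_of_species[next(iter(sp))]) == 1
    }
    return {"correct": correct, "lumped": lumped, "split": split}


# Hypothetical cluster assignments at a single similarity threshold.
example = [
    ("c1", "sp_A"), ("c1", "sp_A"),
    ("c2", "sp_B"), ("c2", "sp_C"),
    ("c3", "sp_C"),
]
result = assess_clusters(example)
```

In this made-up example, cluster `c2` lumps two species while `sp_C` is simultaneously split across `c2` and `c3`, so only `c1` counts as correct. Repeating such a summary over a range of thresholds is one way to compare them without relying on the cluster count alone.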