3.1 Species delimitation
The complete data set consisted of 1,492 barcodes, ranging from 312 to
658 bp in length. In total, there were 319 variable sites (48.5%), of
which 299 (93.7%) were parsimony informative. Most parsimony
informative sites occurred in the third codon position (Table 1). The
sequences were heavily AT-biased, especially in the third position,
which exhibited a combined average AT-composition of 89.3% (Table 1).
Average intraspecific and interspecific K2P distances for all analyzed
Polypedilum species were 1.3% and 15.2%, respectively. The
barcoding gap is a central concept in barcoding studies (Puillandre et
al., 2012): such a gap exists when the amount of intraspecific
divergence is much smaller than the amount of interspecific divergence
(Meyer & Paulay, 2005). On average, our data showed clearly larger
interspecific than intraspecific divergences, but the pairwise K2P
distances did not show the expected ‘barcoding gap’. Instead, the
intraspecific and interspecific distances overlapped, which may be
attributable to the presence of cryptic species diversity and a few
misidentifications. The lack of a
gap is usually associated with recently diverged species with little
genetic diversification, frequently coupled with incomplete lineage
sorting and introgression (Wiemers & Fiedler, 2007; Dupuis et al.,
2012).
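The barcoding-gap check described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names (`k2p_distance`, `barcode_gap_bounds`) and the toy sequences are hypothetical, and the sequences are assumed to be already aligned. The distance follows the standard Kimura 2-parameter formula, d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. A gap exists only if the smallest interspecific distance exceeds the largest intraspecific distance.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences."""
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError for saturated pairs (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def barcode_gap_bounds(sequences, species):
    """Return (largest intraspecific, smallest interspecific) distance."""
    intra, inter = [], []
    for i, j in combinations(range(len(sequences)), 2):
        d = k2p_distance(sequences[i], sequences[j])
        (intra if species[i] == species[j] else inter).append(d)
    if not intra or not inter:
        raise ValueError("need both intra- and interspecific pairs")
    return max(intra), min(inter)


# Toy data (hypothetical, not taken from the study).
toy_seqs = ["AAAAAAAAAA", "AAAAAAAAAG", "CCAAAAAAAA", "CCAAAAAAAG"]
toy_species = ["sp. A", "sp. A", "sp. B", "sp. B"]
max_intra, min_inter = barcode_gap_bounds(toy_seqs, toy_species)
gap_exists = min_inter > max_intra
```

When the two distance ranges overlap, as reported here, `gap_exists` would be false, which is the situation in which a single distance threshold cannot separate all species.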
Overall, most of the tested methods recovered similar groupings of
molecular operational taxonomic units (MOTUs) (Figures 1-4), with the
mPTP method being the most conservative, lumping the sequences into
the fewest MOTUs, and the bPTP algorithm the most relaxed, splitting
the sequences into the most MOTUs (Table 2). Two of the three
distance-based methods, ABGD and ASAP, yielded unreliable delimitations
with wide confidence intervals: several clusters did not reflect
relationships as understood from the geographical sampling localities,
and others were split into numerous lineages despite low divergence
between them. ABGD and ASAP results were therefore not
included in Figures 1-4. The BIN analysis returned a total of 415
MOTUs of which 174 were singleton BINs, 222 concordant BINs, and 19
discordant BINs. In total, 615 sequences of 143 morphospecies were
assigned to 179 BINs: 72 sequences fell in singleton BINs, 519 in
concordant BINs, and 24 in discordant BINs. The unidentified 877
specimens, without binomial
names, were assigned to 236 BIN-species, including 102 singleton BINs,
118 concordant BINs, and 16 discordant BINs.
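As a rough illustration of the three BIN categories reported above, the following Python sketch classifies each BIN by its membership. It assumes the usual reading of these terms (a singleton BIN holds one sequence, a concordant BIN holds several sequences under one taxon name, a discordant BIN holds sequences under more than one name). The function name `classify_bins`, the BIN identifiers, and the records are hypothetical and are not data from this study.

```python
from collections import Counter, defaultdict


def classify_bins(records):
    """Map each BIN id to 'singleton', 'concordant', or 'discordant'.

    records: iterable of (bin_id, taxon_name) pairs, one per sequence.
    """
    members = defaultdict(list)
    for bin_id, taxon in records:
        members[bin_id].append(taxon)

    status = {}
    for bin_id, names in members.items():
        if len(names) == 1:
            status[bin_id] = "singleton"
        elif len(set(names)) == 1:
            status[bin_id] = "concordant"
        else:
            status[bin_id] = "discordant"
    return status


# Hypothetical records: (BIN id, taxon name assigned to the sequence).
toy_records = [
    ("BIN-1", "sp. A"), ("BIN-1", "sp. A"),
    ("BIN-2", "sp. B"),
    ("BIN-3", "sp. A"), ("BIN-3", "sp. C"),
]
bin_status = classify_bins(toy_records)
bin_counts = Counter(bin_status.values())
```

Counting BINs per category and counting sequences per category give different totals, so a summary should state which of the two is being reported.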
DNA-based species delimitation applying bPTP, mPTP, sPTP, and sGMYC
resulted in different numbers of clusters. The single-threshold general
mixed Yule-coalescent calculations (sGMYC) recovered 370 MOTUs, while
the sPTP model produced a more conservative number of MOTUs (411)
compared to the bPTP method, which yielded 520 MOTUs (Table 2). The
results from analyses using the multi-rate PTP (mPTP) model were also
comparable to those of the other models, but revealed larger clusters,
occasionally joining lineages belonging to different species in a single
MOTU (Figure 1). Differences in the number of clusters generated by the
different species delimitation algorithms are caused by erroneously
inferred splitting or lumping events (i.e., specimens of one
morphospecies were divided among two or more MOTUs, or specimens of
different morphospecies were joined in a single MOTU).
However, regardless of the method applied, the total number of species
delimited in Polypedilum in this study is nearly twice to more than
three times as high (267–520) as the number of included morphospecies
(143, see above).
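The splitting and lumping events mentioned above can be flagged by cross-tabulating morphospecies against MOTU assignments. The Python sketch below is only an illustration under stated assumptions: the function name `split_and_lumped`, the MOTU labels, and the toy assignments are hypothetical, and the sketch says nothing about which side (morphology or the delimitation method) is in error.

```python
from collections import defaultdict


def split_and_lumped(assignments):
    """Find split morphospecies and lumped MOTUs.

    assignments: iterable of (morphospecies, motu) pairs, one per specimen.
    Returns (split, lumped):
      split  maps a morphospecies to its MOTUs when it spans more than one;
      lumped maps a MOTU to its morphospecies when it holds more than one.
    """
    motus_per_species = defaultdict(set)
    species_per_motu = defaultdict(set)
    for species, motu in assignments:
        motus_per_species[species].add(motu)
        species_per_motu[motu].add(species)

    split = {s: sorted(m) for s, m in motus_per_species.items() if len(m) > 1}
    lumped = {m: sorted(s) for m, s in species_per_motu.items() if len(s) > 1}
    return split, lumped


# Hypothetical assignments from one delimitation method.
toy_assignments = [
    ("sp. A", "MOTU-1"), ("sp. A", "MOTU-2"),  # sp. A split across two MOTUs
    ("sp. B", "MOTU-3"), ("sp. C", "MOTU-3"),  # sp. B and sp. C lumped
    ("sp. D", "MOTU-4"),
]
split, lumped = split_and_lumped(toy_assignments)
```

Running the same tabulation for each method would show where the methods disagree, for example where a relaxed method splits a morphospecies that a conservative method keeps whole.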