Proteomic differentiation
In total, 96 H. bicuspis specimens were used in the MALDI-TOF MS analysis. Based on the COI sequences, 56 specimens were assigned to H. bicuspis I, 7 to H. bicuspis II, 11 to H. bicuspis III, 9 to H. bicuspis IV and 13 to H. bicuspis V. Intra-specific Euclidean distances (pooling all lineages of H. bicuspis) ranged from 0.56 (10% quantile) to 0.83 (90% quantile), while inter-congener distances ranged from 0.93 (10% quantile) to 1.05 (90% quantile), with very little overlap. Lineage-specific distances were all in the same range, with no distinct differences; only a slight tendency was found for the inter-lineage distances between I-III and IV to exceed the intra-lineage distances within I-III (Figure 3). The Robust Distance-based Multivariate Analysis of Variance (Wd*) (Hamidi et al. 2018) revealed significant differences among the groups I-III, IV, and V (p < 0.001). The principal component analysis (PCA) (Figure 3) of the processed data shows two major groups. One comprises the specimens belonging to H. bicuspis group I-III (Figure 1, blue) from north of Iceland, and the other includes specimens from sampling sites south of Iceland (Figure 1, beige and green). The t-SNE plot based on the classification votes of a random forest (RF) classification model constrained to the predefined groups (Figures 6-10) supports a differentiation of these regional groups based on proteome data. In a classification approach, 98.6% of group H. bicuspis I-III would be identified correctly, with one specimen (ZMH K-58552) being classified as H. bicuspis IV. Similarly, one specimen of H. bicuspis IV (ZMH K-58494) would be classified as H. bicuspis I-III (11.1%), and three (ZMH K-58496, ZMH K-58577 and ZMH K-58579) as H. bicuspis V (33.3%). Of H. bicuspis V, 92.3% would be classified correctly, with one specimen (ZMH K-58521) being assigned to IV. By investigating the most important variables within the RF model, ranked by the highest decrease in Gini index, peaks were identified that show group-specific behavior.
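The classification rates given above follow directly from the specimen counts. The short Python sketch below rebuilds them as a confusion table. The counts are taken from the text, except for the five correctly classified H. bicuspis IV specimens, which are inferred here as 9 - 1 - 3 and are not stated explicitly. The names `confusion`, `row_percentages` and `rates` are illustrative only.

```python
# Confusion counts reconstructed from the figures reported in the text.
# Outer keys: COI-based group; inner keys: group assigned by the classifier.
# Assumption: the 5 correct H. bicuspis IV assignments are inferred (9 - 1 - 3).
confusion = {
    "I-III": {"I-III": 73, "IV": 1, "V": 0},
    "IV": {"I-III": 1, "IV": 5, "V": 3},
    "V": {"I-III": 0, "IV": 1, "V": 12},
}


def row_percentages(counts):
    """Convert one row of counts to percentages of the row total."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


rates = {group: row_percentages(row) for group, row in confusion.items()}

for group, row in rates.items():
    print(group, {label: round(pct, 1) for label, pct in row.items()})
```

Read row by row, the table makes the asymmetry visible: misassignments of I-III and V are single specimens, whereas H. bicuspis IV is spread over all three groups.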
Haploniscus bicuspis IV and V show shifts (in the range of 40-50 Daltons) in larger proteins compared to H. bicuspis I-III, while H. bicuspis V and IV were mainly separated by the relative expression of proteins with m/z of 2400, 4407 and 2680 (Figure 4). Although the mere presence or absence of these peaks would probably not be sufficient to distinguish between groups, the relative peak intensities differ consistently between groups.
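To illustrate why relative intensity can separate groups where presence or absence cannot, the following Python sketch computes the decrease in Gini impurity for a single threshold split on one peak. It is a simplified, single-split illustration and not the RF importance computation used in the analysis, which aggregates such decreases over many splits and trees. All intensity values and the threshold are made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(intensities, labels, threshold):
    """Impurity decrease when splitting specimens at an intensity threshold."""
    left = [lab for x, lab in zip(intensities, labels) if x <= threshold]
    right = [lab for x, lab in zip(intensities, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Hypothetical relative intensities of one peak in eight specimens.
# The peak is present in every specimen, so presence/absence carries no signal.
labels = ["IV", "IV", "IV", "IV", "V", "V", "V", "V"]
informative = [0.10, 0.15, 0.12, 0.20, 0.55, 0.60, 0.48, 0.70]
uninformative = [0.10, 0.50, 0.10, 0.50, 0.10, 0.50, 0.10, 0.50]

decrease_informative = gini_decrease(informative, labels, threshold=0.30)
decrease_uninformative = gini_decrease(uninformative, labels, threshold=0.30)

print(decrease_informative, decrease_uninformative)
```

In this toy case the first peak separates the two groups completely at the chosen threshold, while the second one, whose intensity is unrelated to group membership, yields no decrease in impurity.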