2.3 Data analysis
Data were processed in MaxQuant version 2.2.0.0 [17] with the Andromeda search engine [18]. The key parameters for the evaluation of label-free data were as follows: a false discovery rate (FDR) of 0.01 for proteins and peptides; digestion: trypsin/P; variable modifications: methionine oxidation and N-terminal protein acetylation. The fixed modification for the dataset MSV000089235 was MethylThio. The data were searched against the constructed wide database (Section 2.2). The normalized label-free quantification (LFQ) intensities were then evaluated in Perseus version 2.0.7.0 [19]. The data were annotated with indicators of the database components by which each protein hit was identified. Hits flagged as contaminants, reverse hits and hits identified only by site were discarded. In addition, only hits with at least one reported quantitative value were analyzed in detail.
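The filtering criteria described above can be illustrated with a minimal Python sketch. It is not the Perseus workflow used in this study. It assumes a tab-delimited table in the style of the MaxQuant proteinGroups.txt output, in which flagged rows carry "+" in the columns "Potential contaminant", "Reverse" and "Only identified by site", and in which quantitative columns begin with "LFQ intensity ". The hit identifiers and sample names (S1, S2) are hypothetical.

```python
import csv
import io

# Column names assumed to follow the MaxQuant proteinGroups.txt convention.
FLAG_COLUMNS = ("Potential contaminant", "Reverse", "Only identified by site")
LFQ_PREFIX = "LFQ intensity "


def is_quantified(value):
    """Return True if the cell holds a positive number (0 and NaN count as missing)."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False
    return number > 0


def filter_protein_groups(rows):
    """Drop flagged hits and keep only hits with at least one LFQ value."""
    kept = []
    for row in rows:
        # Discard contaminants, reverse hits and hits identified only by site.
        if any((row.get(col) or "").strip() == "+" for col in FLAG_COLUMNS):
            continue
        lfq_values = [v for k, v in row.items() if k.startswith(LFQ_PREFIX)]
        if any(is_quantified(v) for v in lfq_values):
            kept.append(row)
    return kept


def read_table(text):
    """Parse tab-delimited text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


# Hypothetical toy table: only hit_1 is unflagged and quantified.
header = ["Protein IDs", *FLAG_COLUMNS, "LFQ intensity S1", "LFQ intensity S2"]
toy_rows = [
    ["hit_1", "", "", "", "1200000", "0"],
    ["hit_2", "+", "", "", "800000", "900000"],
    ["hit_3", "", "+", "", "500000", "0"],
    ["hit_4", "", "", "+", "300000", "0"],
    ["hit_5", "", "", "", "0", "NaN"],
]
toy_table = "\n".join("\t".join(line) for line in [header, *toy_rows])

kept = filter_protein_groups(read_table(toy_table))
print([row["Protein IDs"] for row in kept])
```

In this sketch, zero and NaN intensities are both treated as missing values, which is an assumption about how unquantified hits are encoded in the exported table.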
To compare the protein hits identified by the different components of the wide database, we used an UpSet plot [20]. The data were analyzed in R using the UpSetR package, version 1.4.0 [21]. Furthermore, differences among identifications that depended on the database component were evaluated, and key protein sequences were selected and analyzed further. The selected sequences were aligned in Clustal Omega [22] and visualized in Jalview, version 2.11.2.6 [23]. In addition, InterPro was used for functional analysis [24, 25].
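The quantity displayed by an UpSet plot is the number of hits belonging to each exclusive combination of sets. The following minimal Python sketch shows this counting step only; it is not the UpSetR code used in this study. The component names and hit identifiers are hypothetical placeholders.

```python
from collections import Counter


def exclusive_intersections(hits_by_component):
    """Count hits per exact combination of database components.

    hits_by_component maps a component name to the set of hits identified
    with that component. Each hit is counted once, under the full set of
    components in which it was found.
    """
    membership = {}
    for component, hits in hits_by_component.items():
        for hit in hits:
            membership.setdefault(hit, set()).add(component)
    return Counter(frozenset(components) for components in membership.values())


# Hypothetical example with three database components.
toy_hits = {
    "component_A": {"hit_1", "hit_2", "hit_3", "hit_5"},
    "component_B": {"hit_2", "hit_3"},
    "component_C": {"hit_3", "hit_4"},
}

counts = exclusive_intersections(toy_hits)
for combination, n in sorted(counts.items(), key=lambda item: (-item[1], sorted(item[0]))):
    print(" & ".join(sorted(combination)), n)
```

Each bar of an UpSet plot corresponds to one key of the resulting counter, so hits found by only one component can be separated from hits shared by several components.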
The markers observed after the secondary analysis of the MS/MS data of MSV000089235 were also reinspected in the previously published paper [12] to determine whether novel markers of interest that had not been selected and discussed were present in the raw result reports. Furthermore, the dataset MSV000083636 was reanalyzed in the same way as the first dataset (see above), except that carbamidomethyl was used as the fixed modification and only selected sequences were added to the search. Because the very wide database of P. larvae is not appropriate for data analysis, only specific searches were performed together with A. mellifera sequences. All the analyses targeted the key markers that could differ based on the virulence and/or the level of infection.
3 Results