3.4 Selected markers in in vivo (prepupa) data
We verified that the large database (D1–D18) was not useful for screening complex data in which the pathogen was present in the host. The markers were searched against the previously published data [12], which were generated with a UniProt-derived database and relate to the raw dataset MSV000089235. The two GHL10‒FN3/alpha-galactosidase isoforms were not differentiated in the analysis; despite sequence differences (amino acid substitutions), they were identified as one hit (e.g., UniProt: W2E369 vs. A0A2L1TZ12). Incidentally, the FASTA header in the original results was “Fibronectin type-III domain-containing protein” or “alpha-galactosidase”. Likewise, the two collagenase isoforms were not differentiated; despite potential sequence differences (amino acid substitutions; e.g., UniProt: A0A2L1UJN7 vs. A0A1V0UTD3), they were present as one hit. Note that no collagenase was identified in the most affected “lysed” samples. A separate new search with the sequence of id: 475 (CP020557.1:False:1915540) did not identify it in the MSV000089235 raw data. Notably, DUF3221 (id: 438) could not be identified using its UniProt accession number, but similar sequences (see above; e.g., id: 232, V9W786/W2E226) and others (id: 311, A0A1V0UTM0) were identified; however, id: 331 (A0A1V0UVZ6; W2E2L8; A0A2L1UAH0) was not identified. In addition, a DUF3862 domain-containing protein (id: 66; UniProt: A0A2L1UHM8; A0A1U9YLD2; A0A2A5JXM) was not identified in the in vivo dataset. Incidentally, neither id: 331 nor id: 66 was identified as being expressed only in ERIC II. The CRISPR-related proteins (section 3.3.3) were present in the in vivo dataset only at trace levels, without a quantitative value, and therefore could not be considered in the evaluations. Importantly, InhA was absent, consistent with an earlier report [12]. Finally, ABC transporters related to iron–siderophore uptake were identified in the in vivo data.
Id: 355 (W2E607; A0A1V0UYN2; A0A2A5JXJ9; V9W3E2; A0A2L1U0G6) was not found in the in vivo data. Id: 156 (W2ED21; V9WC31; A0A6C0QLW6; A0A2L1U9J1; A0A2A5JZ82; A0A1U9YPW6) had low abundance, was detected only at trace levels, and was therefore not reported among the LFQ-positive data. Importantly, id: 261 (V9W5A1; W2E8R6; A0A2A5K2P5; A0A1V0UNQ7; A0A2L1UCR7; A0A1U9YRI3) was present in 11/12 P. larvae PCR-positive samples, with the highest relative abundance in the “lysed” samples. Two proteins participating in “bacillibactin” synthesis could be identified: dhbA (V9W311) was detected only at trace levels, whereas dhbC (V9W8G5) was identified in three samples but not in “lysed” larvae.