Protein functional network analysis of DIA data
STRING enrichment analysis was performed using abundance data for 2114
detectable proteins out of 2120 proteins included in the DIA assay
library to identify categories that are enriched and depleted following
acclimation of fish to BW (Table S1). In addition, a smaller subset of
proteins was used for more in-depth network analysis, which included all
proteins from clusters 1 and 6, as well as the remaining 5 significantly
regulated proteins that were not in clusters 1 or 6. All significantly
regulated mRNAs were also included by using their corresponding
protein accession numbers (ACs) (Figure 5). In total, 277
unique protein ACs were queried with STRING for further network and
pathway analysis.
From this list, 174 protein ACs from the DIA assay library and 1 protein
AC with a significantly regulated mRNA were mapped to corresponding
STRING IDs while the remainder had no STRING ID. The resulting list of
175 STRING IDs was enriched in kidneys of BW fish with
FDR<0.05 for “mitochondrion”, “monooxygenase”,
“oxidoreductase”, “NADP”, “microsome”, “GTP-binding”,
“glycolysis”, and “FAD” (Table 1). Protein domain enrichment in
kidneys of BW fish was most significant (FDR<0.002) for
aldo/keto reductase, short chain dehydrogenase/reductase SDR, flavin
binding monooxygenase, and pyridine nucleotide-disulphide
oxidoreductase.
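The FDR thresholds above imply a multiple-testing correction across
annotation terms. As a minimal sketch, assuming a Benjamini-Hochberg
adjustment (the text does not state which correction was applied) and
using hypothetical term names and p-values that are not taken from this
study, the adjustment and filtering step could look like this:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four annotation terms (illustrative only).
raw = {"term_a": 0.001, "term_b": 0.012, "term_c": 0.03, "term_d": 0.6}
adj = benjamini_hochberg(list(raw.values()))

# Keep terms passing the FDR<0.05 cutoff used in the text.
significant = [term for term, q in zip(raw, adj) if q < 0.05]
print(significant)
```

A stricter cutoff such as FDR<0.002 only changes the threshold in the
final filter.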
Using only the small set of significantly regulated proteins for STRING
analysis yielded far fewer enriched categories than using the complete
set of 175 STRING IDs or using sets consisting of all proteins in
clusters 1 (down-regulated) and 6 (up-regulated) (Table 1).
Significantly regulated proteins also did not correspond to any of the
protein identifiers enriched in the whole protein set (Table S1). This
result demonstrates the added value of performing unbiased clustering
and STRING analysis in tandem.
When the list of 175 STRING IDs was consolidated into a Markov
Clustering (MCL) network, 119 IDs had at least one edge to another
protein node in this list. At an MCL inflation parameter of 1.3,
proteins were separated into 7 networks that included all 23 significant
proteins that were connected to another node by at least one edge
(Figure 6). Networks 1 and 3 account for 13 of the significant proteins
and include 6 of the 8 most highly regulated proteins found in
the network map, defined as having an FC greater than 4 (Figure 7).
Network 1 was enriched for the keyword hydrolase and the protein domains
glutathione S-transferase, papain cysteine protease, and
thioredoxin-like. Network 3 was enriched for the keyword ligase and the
protein domains aldo-keto reductase, acetyl-CoA synthetase, acetate-CoA
ligase, and NADP-dependent oxidoreductase. Other networks were
associated with terms also enriched in the overall list, such as
cytochrome c oxidase for network 4 and short chain dehydrogenase/reductase SDR for
network 2 (Table 1), linking STRING networks with specific cellular
responses.
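For readers unfamiliar with MCL, the clustering step can be illustrated
with a minimal pure-Python sketch. It is not the STRING implementation.
It assumes an unweighted, undirected graph, and the protein names and
edges are hypothetical. Only the inflation value of 1.3 comes from the
text. The sketch alternates expansion (matrix squaring) and inflation
(element-wise power followed by column normalization) until the matrix
stops changing:

```python
def normalize_columns(matrix):
    """Scale each column so that it sums to 1."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(nodes, edges, inflation=1.3, max_iter=200, tol=1e-9):
    """Cluster an undirected graph with a basic Markov Clustering loop."""
    idx = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    # Adjacency matrix with self-loops, so no column is ever empty.
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b in edges:
        m[idx[a]][idx[b]] = 1.0
        m[idx[b]][idx[a]] = 1.0
    m = normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: square the matrix (two random-walk steps).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: raise entries to a power, then renormalize columns.
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each non-empty row is an attractor; its non-zero columns form a cluster.
    clusters = set()
    for row in m:
        members = frozenset(nodes[j] for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical nodes and edges (illustrative only, not from this study).
nodes = ["P1", "P2", "P3", "P4", "P5", "P6"]
edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P4", "P5")]
clusters = mcl(nodes, edges, inflation=1.3)
print(clusters)
```

In this toy graph P6 has no edges and ends up as a singleton, analogous
to the STRING IDs that had no edge to another node in the list. In a
connected graph, a larger inflation value generally gives more and
smaller clusters.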
In a separate analysis, the query list of 277 unique protein ACs was
associated with 236 KEGG orthology (KO) identifiers. From the complete
list, over-representation of KO identifiers was greatest for KEGG
pathways 01220 (degradation of aromatic compounds), 00625 (chloroalkane
and chloroalkene degradation), and 00982 (drug metabolism - cytochrome
P450). For significantly up-regulated proteins, over-representation was
greatest in pathway 00053 (ascorbate and aldarate metabolism) due to
the presence of UDP-glucuronosyltransferase (UGT) and aldehyde
dehydrogenase (ALDH). This pathway was especially important as it also
contains the significantly down-regulated MIOX protein (Figure 8). KEGG
over-representation for the full list can be linked predominantly to
STRING network 3, which is enriched in KEGG pathways 01220, 00625, and
00053. The greatest over-representation of significantly down-regulated
proteins was in KEGG pathway 05100 (bacterial invasion of epithelial
cells), which was also strongly over-represented in network 6 due to
the significantly down-regulated proteins actin-related protein 2/3
complex subunit 1A, integrin-linked kinase, cell division cycle
control protein 42, and dynamin-2 isoform X4.
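Pathway over-representation of this kind is commonly assessed with a
one-sided hypergeometric test. The sketch below assumes that test (the
text does not name the statistic or tool used). All counts are
hypothetical and are not taken from this study:

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items without replacement from N items,
    of which K belong to the pathway of interest."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability mass for every overlap size from k up to the maximum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts (illustrative only):
background_size = 2000   # annotated identifiers in the background set
pathway_size = 40        # background identifiers assigned to the pathway
query_size = 100         # identifiers in the query list
overlap = 8              # query identifiers assigned to the pathway

p_value = hypergeom_sf(overlap, background_size, pathway_size, query_size)
# Observed overlap divided by the overlap expected by chance.
fold_enrichment = (overlap / query_size) / (pathway_size / background_size)
print(p_value, fold_enrichment)
```

When many pathways are tested, the resulting p-values would normally be
corrected for multiple testing before being interpreted.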
Key kidney proteins associated with BW acclimation of fish were
identified based on statistical significance, degree of FC, correlation
with mRNA regulation, and their presence in central STRING networks and
KEGG pathways. Some of these proteins were represented by only one
paralog, but several had one significantly regulated paralog while
other paralogs had a low FC or were
regulated in the opposite direction. These proteins include the
up-regulated von Willebrand factor A domain-containing (VFA) protein,
elongation factor A (EFA), ALDH and UGT, and the down-regulated proteins
hemoglobin and DHRS11 (Figure 9).