Legend: Id. – database ID as designated in this study; Strain
– description as denoted in GenBank; Level – current genome status as
provided in GenBank; Mb – size of the genome assembly; BioProject –
accession, provided as a hypertext link; Protein seq – number of protein
sequences available for the BioProject. *Note that PRJNA362897, with
16,238 proteins, consists of four genome assemblages.
Figure 1. UpSet analysis of the proteomic results obtained using
a wide database. The plot shows the overlap (consensus) of protein hits
identified in particular components of the database. The result
indicates large differences in identification among the components.