Ocean metaproteomics provides valuable insights into the structure and function of marine microbial communities. Yet ocean samples are challenging to analyze because their extensive biological diversity yields a very large number of peptides spanning a large dynamic range. This study characterized the capabilities of data-independent acquisition (DIA) for ocean metaproteomic samples. Spectral libraries were constructed from discovered peptides and proteins, using machine learning algorithms to limit the incorporation of false positives into the libraries. When compared with 1-dimensional and 2-dimensional data-dependent acquisition (DDA) analyses, DIA outperformed DDA both with and without gas-phase fractionation. We found that larger spectral libraries of discovered proteins performed better, regardless of the geographic distance between where samples were collected for library generation and where the test samples were collected. Moreover, the spectral library containing all unique proteins present in the Ocean Protein Portal outperformed smaller libraries generated from individual sampling campaigns. However, a spectral library constructed from all open reading frames in a metagenome proved too large to be workable: maintaining a low false discovery rate with so large a database resulted in few peptide identifications. Given sufficient sequencing depth and validation studies, spectral libraries generated from previously discovered proteins can serve as a community resource, saving resequencing efforts. The spectral libraries generated in this study are available at the Ocean Protein Portal for this purpose.
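The database-size effect described above can be illustrated with a generic target-decoy estimate of the false discovery rate. This is a minimal sketch, not the pipeline used in the study: the function name `accepted_at_fdr`, the scores, and the 1% threshold are all hypothetical, and the premise that a larger search space produces more high-scoring random (decoy) matches is stated here as an assumption of the illustration.

```python
def accepted_at_fdr(psms, max_fdr=0.01):
    """Count target matches that can be accepted at or below max_fdr.

    psms: iterable of (score, is_decoy) pairs, higher score is better.
    The FDR at each score cutoff is estimated as decoys / targets
    among all matches scoring at or above that cutoff.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest cutoff that still satisfies the FDR limit.
        if targets and decoys / targets <= max_fdr:
            best = targets
    return best


# Hypothetical data: the same 100 target matches in both searches.
target_matches = [(float(s), False) for s in range(200, 100, -1)]

# Small library: a single low-scoring decoy match.
small_library = target_matches + [(50.0, True)]

# Large library: random matches score higher (assumed for illustration).
large_library = target_matches + [(150.5, True), (140.5, True), (130.5, True)]

print(accepted_at_fdr(small_library), accepted_at_fdr(large_library))
```

With these made-up numbers, the high-scoring decoys in the larger search force a stricter score cutoff, so fewer of the same target matches pass the 1% threshold.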

Michael Jakuba and 3 more

We report the design of, and results from a series of recent cruises using, a fast vertical profiling autonomous underwater vehicle called Clio. Clio has been designed specifically to complement conventional wire-based sampling techniques: by operating simultaneously with and independently of those techniques, it improves ship-time utilization and thereby cost-effectively improves the understanding of marine microorganism ecosystem dynamics on a global scale. Life processes and ocean chemistry are linked: ocean chemistry places constraints on marine metabolic processes, and life processes alter the speciation, chemical associations, and water-column residence time of seawater constituents. Advances in sequencing technology and in situ preservation have made it possible to study the genomics (DNA), transcriptomics (RNA), proteomics (proteins and enzymes), metabolomics (lipids and other metabolites), and metallomics (metals) associated with marine microorganisms; however, at present these techniques require sample collection. For this purpose, Clio's primary payload consists of two Suspended-Particle Rosette (SUPR) multi-samplers capable of returning up to 20 sets of filtered samples and filtrate per dive, and of filtering up to 280 L of water per sample. Clio also hosts profiling sensors, presently a Seabird Electronics CTD, a WET Labs combined chlorophyll and backscatter fluorimeter, and a C-Star transmissometer. Since sea trials in 2017, Clio has participated in 5 cruises, most recently a section cruise between Bermuda and Woods Hole in June 2019. On that cruise, Clio executed a total of 9 nightly dives of 12-16 hours each and filtered a total of 20,878 L of seawater. The vehicle holds depth to a precision of better than 5 cm, is rated to 6000 m (4100 m maximum depth to date), and transits the water column at 45 m/min.
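The figures quoted above can be combined into a few rough derived quantities. The sketch below is simple arithmetic on the stated numbers only; it assumes, for illustration, that the per-dive and per-sample maxima could be reached in the same dive, and the transit estimate ignores any time spent holding depth to sample.

```python
# Figures taken from the text above.
MAX_SAMPLE_SETS_PER_DIVE = 20
MAX_LITRES_PER_SAMPLE = 280
DIVES_ON_2019_CRUISE = 9
TOTAL_LITRES_FILTERED = 20_878
TRANSIT_SPEED_M_PER_MIN = 45

# Nominal ceiling on filtered volume per dive (assumes both maxima coincide).
max_litres_per_dive = MAX_SAMPLE_SETS_PER_DIVE * MAX_LITRES_PER_SAMPLE

# Mean volume actually filtered per dive on the June 2019 cruise.
mean_litres_per_dive = TOTAL_LITRES_FILTERED / DIVES_ON_2019_CRUISE


def transit_minutes(depth_m):
    """One-way transit time at the quoted speed, excluding sampling stops."""
    return depth_m / TRANSIT_SPEED_M_PER_MIN


print(f"Nominal ceiling per dive: {max_litres_per_dive} L")
print(f"Mean filtered per dive:   {mean_litres_per_dive:.0f} L")
print(f"One-way transit to 4100 m: {transit_minutes(4100):.0f} min")
print(f"One-way transit to 6000 m: {transit_minutes(6000):.0f} min")
```

On these numbers, the mean volume filtered per dive is roughly 40% of the nominal ceiling, and a one-way transit to the rated depth takes a little over two hours.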
Clio has demonstrated consistent, reliable performance in its intended role; however, opportunities exist to further exploit its capabilities. Clio's last two dives included autonomous data-driven selection of sample depths to better capture the deep chlorophyll maximum. Clio's large payload capacity (tens of watts, tens of kilograms) could host novel samplers as well as in situ sample processors and other profiling instruments.
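The text does not describe how sample depths were selected autonomously. As one simple possibility, a vehicle could use the chlorophyll fluorescence recorded on its descent to locate the chlorophyll maximum and then place samples around it. The sketch below is an assumption-laden illustration, not Clio's algorithm: the function names, the moving-average window, and the sample spacing are all hypothetical.

```python
def chlorophyll_max_depth(depths_m, chl, window=3):
    """Return the depth at the peak of a smoothed chlorophyll profile.

    A centered moving average (truncated at the profile ends) is used
    to reduce the influence of single-point spikes.
    """
    if not depths_m or len(depths_m) != len(chl):
        raise ValueError("depths_m and chl must be non-empty and equal length")
    half = window // 2
    best_index, best_value = 0, float("-inf")
    for i in range(len(chl)):
        lo, hi = max(0, i - half), min(len(chl), i + half + 1)
        smoothed = sum(chl[lo:hi]) / (hi - lo)
        if smoothed > best_value:
            best_index, best_value = i, smoothed
    return depths_m[best_index]


def sample_depths_around(peak_m, spacing_m=10.0, count=3):
    """Return `count` depths centered on the peak, clipped at the surface."""
    offsets = [(k - (count - 1) / 2) * spacing_m for k in range(count)]
    return [max(0.0, peak_m + offset) for offset in offsets]


# Hypothetical descent profile: depth (m) and fluorescence (arbitrary units).
depths = [0, 20, 40, 60, 80, 100, 120, 140]
fluorescence = [0.1, 0.1, 0.2, 0.5, 0.9, 0.6, 0.2, 0.1]

peak = chlorophyll_max_depth(depths, fluorescence)
print(peak, sample_depths_around(peak))
```

Selecting depths from the descent profile means the samples on the ascent track the feature as observed on that dive, rather than relying on depths fixed before launch.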