Establishing a Comprehensive Workflow for Extracting MS1 Isotope
Distributions in LC-MS/MS Proteomics
Abstract
The continuous advancements in LC-MS/MS proteomics over the past decades
have paved the way for transformative changes in the field of medicine,
particularly in the realms of preventive and personalized healthcare.
Many new algorithms are evaluated on unknown proteomes and with
databases of annotated MS2 spectra. For research focused on MS1
spectra, such databases are not yet available. To address this gap, we
propose a comprehensive workflow to extract MS1 isotope distributions
from spectra, which we validated using a proteomics standard kit
comprising known proteins at varying concentrations in duplicate. Our
workflow incorporated a database search utilizing a state-of-the-art
algorithm at 1% FDR. Through this approach, we investigated the impact
of protein concentration on the probability of protein identification.
Confidently identified PSMs were used to extract the MS1 isotope
distributions through the proposed workflow. A total of 138,111 MS1
isotope distributions were extracted. Isotope distributions with two or
more peaks were compared with their theoretical isotope distributions
using the spectral angle. Median spectral angles of 0.101 and 0.0992
were observed in the two samples, respectively, indicating high
similarity. The findings
from this study were compiled into a dataset which can potentially
facilitate the development of novel tools with a focus on MS1 data.
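
The comparison between an extracted and a theoretical isotope distribution can be illustrated with a minimal sketch. It assumes the spectral angle is the arccosine of the cosine similarity between the two intensity vectors, in radians, so that values near 0 mean near-identical shapes; the abstract does not state the exact definition, and normalized variants exist. The function name and the example intensities are hypothetical and are not taken from the dataset.

```python
import math


def spectral_angle(observed, theoretical):
    """Return the angle (radians) between two isotope intensity vectors.

    Assumption: spectral angle = arccos(cosine similarity). Both vectors
    must list intensities for the same isotope peaks in the same order.
    """
    if len(observed) != len(theoretical):
        raise ValueError("distributions must have the same number of peaks")
    dot = sum(o * t for o, t in zip(observed, theoretical))
    norm_obs = math.sqrt(sum(o * o for o in observed))
    norm_theo = math.sqrt(sum(t * t for t in theoretical))
    if norm_obs == 0.0 or norm_theo == 0.0:
        raise ValueError("distributions must contain a non-zero intensity")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_sim = max(-1.0, min(1.0, dot / (norm_obs * norm_theo)))
    return math.acos(cos_sim)


# Hypothetical three-peak example: raw observed intensities versus
# theoretical relative abundances. The measure is scale-invariant,
# so no prior normalization is required.
observed_peaks = [1.0e6, 6.1e5, 2.3e5]
theoretical_peaks = [0.55, 0.32, 0.13]
angle = spectral_angle(observed_peaks, theoretical_peaks)
```

Because the angle depends only on the direction of the two vectors, absolute intensity differences between runs do not affect it; only the relative shape of the isotope envelope does.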