Introduction
Plant metabolome analysis is now of crucial importance for describing responses to environmental conditions and thus understanding the associated metabolic mechanisms. Standardised metabolomics protocols have been proposed to characterise crop metabolism (Zheng, Johnson, Mandal, & Wishart, 2021). The term “metabolomics” refers to the techniques used to investigate “small” biological molecules (metabolites), i.e., extractable molecules of limited molecular weight, usually less than 500 atomic mass units (a.m.u.) (Roessner & Bowne, 2009). Two mainstream mass spectrometry techniques can be used: gas chromatography coupled to mass spectrometry (GC-MS) and liquid chromatography coupled to mass spectrometry (LC-MS) (Allwood et al., 2011; Perez de Souza, Alseekh, Naake, & Fernie, 2019). The output of metabolomics is a dataset of metabolic features (m/z metabolite peaks with retention time in LC-MS; analytes resulting from metabolite derivatisation in GC-MS) that can be used for statistical analysis to detect metabolic changes between samples.
GC-MS metabolomics (sometimes also referred to as metabolic profiling) has been used extensively in plants across many conditions, species and genetic backgrounds: a simple search in a literature database with the query keywords ‘plant’, ‘metabolomics’ and ‘gc-ms’ returns 43,700 entries. When restricted to Arabidopsis, it returns 13,500 entries. This illustrates the extensive use of GC-MS in plant physiology and molecular biology. In particular, this technique is very useful for obtaining, in a single sample analysis, relative quantitation of the most important metabolites of plant primary metabolism, such as amino acids, small soluble sugars, polyamines, or organic acids. It has thus been used to describe the response of C and N primary metabolism to major environmental cues, for example, herbivores (Jansen et al., 2009), CO2 mole fraction (Högy, Keck, Niehaus, Franzaring, & Fangmeier, 2010; Misra & Chen, 2015), drought (Bowne et al., 2012; Sanchez, Schwabe, Erban, Udvardi, & Kopka, 2012), nutrient conditions (Cui, Abadie, Carroll, Lamade, & Tcherkez, 2019; Cui, Davanture, Lamade, Zivy, & Tcherkez, 2021; Cui, Davanture, Zivy, Lamade, & Tcherkez, 2019), or abiotic stress combinations (Ghatak, Chaturvedi, & Weckwerth, 2018; Nakabayashi & Saito, 2015; Shulaev, Cortes, Miller, & Mittler, 2008).
To date, the vast majority of GC-MS analyses for metabolic profiling utilise nominal mass acquisition (i.e. at a.m.u. resolution). Accordingly, databases associated with GC-MS metabolomics such as the Golm Metabolome Database (GMD) provide spectral data at a.m.u. resolution, and in a recent review of metabolomics resources, only nominal mass databases are discussed for GC-MS analyses (Vinaixa et al., 2016). In other words, to our knowledge, there is no directly accessible, comprehensive, high resolution (exact mass, i.e. at a resolution of 0.0001 a.m.u. or finer) GC-MS resource for plant metabolomics. This lack of a curated, accessible resource for GC-MS analyses has three origins: (i) The availability of (affordable) exact mass GC-MS instruments is relatively recent, since the implementation of the Orbitrap technology took place in the 2000s (Makarov, 2000; Makarov, Denisov, & Lange, 2009; Peterson, McAlister, Quarmby, Griep-Raming, & Coon, 2010) and standard practices for high resolution GC-MS analyses were proposed only in 2021 (Misra, 2021); (ii) Many ordinary applications of GC-MS metabolomics profiling do not require exact mass resolution since they target common, well-known compounds; and (iii) Whenever high resolution is required, LC-MS can be used. High resolution may be important in LC-MS full-scan analyses, because there are few fragments (mainly the parent ion and adducts) and identification therefore relies essentially on both exact mass and isotopic pattern (De Vos et al., 2007; Kind & Fiehn, 2006). By contrast, in GC-MS analyses, the fragmentation pattern along with the retention index are used to identify analytes, with generally good accuracy.
Several tools have recently been proposed to automatically annotate ions or fragments in mass spectra, in particular from LC-MS spectral data (for example, Doerfler et al., 2014; Gaquerel, Kuhl, & Neumann, 2013; Matsuda et al., 2011; Qiu, Fine, Wherritt, Lei, & Sumner, 2016).
However, there are circumstances where high mass resolution may be desirable with GC-MS, since (i) several compounds with the same retention time could generate fragments with the same nominal mass, (ii) it could be useful to distinguish isotopic species (isotopologues) using their mass difference (for example, substituting 13C for 12C adds 1.003355 Da whereas substituting 2H for 1H adds 1.006277 Da), and (iii) one may wish to perform untargeted GC-MS analyses with broad chemical coverage. Here, we describe an exact mass GC-MS method for routine high resolution plant metabolic profiling and provide the associated curated database, checked with authentic standards. This allows us to address aspects (i) and (ii) directly. We also provide a list of common compounds that have similar nominal-mass fragments and similar retention times but can be distinguished easily with exact mass, thereby avoiding quantification errors. We also take advantage of sulphur isotopes at natural abundance to allow the identification of S-containing fragments in datasets. Finally, we applied our protocol and the database to Arabidopsis leaves to show how they can be used with real samples, allowing facile differentiation of genetic accessions.
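To give an order of magnitude for point (ii), the resolving power needed to separate a 13C isotopologue from a 2H isotopologue of the same fragment can be estimated from the two mass differences quoted above. The short Python sketch below is only an illustration and is not part of the method described here: it assumes singly charged ions and the simple definition R = m/Δm (instrument specifications are often given at a particular peak-width criterion, so actual requirements may be higher), and the function name and example m/z values are hypothetical.

```python
# Illustrative sketch only: estimates the resolving power (m / delta_m)
# needed to separate 13C- and 2H-labelled isotopologues of the same fragment.
# Assumptions: singly charged ions, R defined simply as m / delta_m.

# Mass added by one heavy-isotope substitution (Da), as quoted in the text.
DELTA_13C = 1.003355  # 13C replacing 12C
DELTA_2H = 1.006277   # 2H replacing 1H

# Mass gap between the two singly substituted isotopologues (Da).
ISOTOPOLOGUE_GAP = DELTA_2H - DELTA_13C


def required_resolving_power(mz: float, delta_m: float) -> float:
    """Return m / delta_m for two ions separated by delta_m around m/z = mz."""
    if mz <= 0 or delta_m <= 0:
        raise ValueError("mz and delta_m must be positive")
    return mz / delta_m


if __name__ == "__main__":
    # Hypothetical fragment masses chosen to span a typical GC-MS range.
    for mz in (100.0, 200.0, 400.0):
        r = required_resolving_power(mz, ISOTOPOLOGUE_GAP)
        print(f"m/z {mz:.0f}: gap {ISOTOPOLOGUE_GAP:.6f} Da, R of about {r:,.0f}")
```

Because the gap between the two isotopologues is only about 0.003 Da, the required resolving power grows linearly with m/z and is far beyond what nominal mass acquisition can provide.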