Bioinformatic analysis
Raw LC-MS data files were converted into mzXML format
and then processed using the XCMS, CAMERA and metaX toolboxes implemented
in the R software. Each ion was identified by combining retention time
(RT) and m/z data. A three-dimensional matrix containing arbitrarily
assigned peak indices (retention time-m/z pairs), sample names
(observations) and ion intensity information (variables) was generated,
and the intensity of each peak was recorded. By matching the exact
molecular mass data (m/z) of the samples against the online Kyoto Encyclopedia
of Genes and Genomes (KEGG) and Human Metabolome Database (HMDB),
the metabolites were annotated. When the mass difference between the
observed and database values was less than 10 ppm, the molecular formula
of each metabolite was further identified and validated by isotopic
distribution measurements. An in-house fragment spectrum library
of metabolites was also used to validate the metabolite identifications.
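
The 10 ppm mass tolerance described above can be illustrated with a short calculation. This is a minimal sketch, not the code used in the study: the function names and the example m/z values are hypothetical, and the actual matching was performed with the R toolboxes named in the text.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error in parts per million, relative to the reference value."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def within_tolerance(observed_mz, reference_mz, tol_ppm=10.0):
    """True when the absolute mass error is below the tolerance (10 ppm here)."""
    return abs(ppm_error(observed_mz, reference_mz)) < tol_ppm


# Hypothetical values for illustration only (not taken from the study).
reference = 180.0634           # assumed database mass
candidates = [180.0640, 180.0660]

for observed in candidates:
    error = ppm_error(observed, reference)
    print(f"{observed:.4f}: {error:.2f} ppm, match={within_tolerance(observed, reference)}")
```

Under these assumed values, the first candidate falls inside the 10 ppm window and the second falls outside it.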
The peak intensity data were further preprocessed with metaX.
Features detected in fewer than 50% of QC samples or 80% of biological
samples were removed, and the remaining peaks with missing values were
imputed with the k-nearest neighbor algorithm to improve data
quality. Principal component analysis (PCA) was conducted on the
pre-processed dataset to detect outliers and evaluate batch effects. Quality
control-based robust LOESS signal correction (QC-RLSC) was fitted to the QC data
with respect to injection order to minimize signal intensity drift over
time. The relative standard deviation (RSD) of each metabolic feature was
calculated across all QC samples, and features with an RSD >30% were then
removed.
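
The RSD filter on QC samples described above can be sketched as follows. This is an illustrative example under stated assumptions, not the metaX implementation: the function names and intensity values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) = sample SD / mean * 100."""
    m = mean(values)
    if m == 0:
        return float("inf")  # treat an all-zero feature as maximally unstable
    return stdev(values) / m * 100.0


def filter_by_qc_rsd(qc_intensities, max_rsd=30.0):
    """Keep features whose RSD across QC injections does not exceed max_rsd."""
    return {
        feature: values
        for feature, values in qc_intensities.items()
        if rsd_percent(values) <= max_rsd
    }


# Hypothetical QC intensities for two features (illustration only).
qc_intensities = {
    "feat_A": [100.0, 105.0, 95.0, 102.0],   # stable across QC injections
    "feat_B": [100.0, 200.0, 50.0, 150.0],   # highly variable
}

kept = filter_by_qc_rsd(qc_intensities)
print(sorted(kept))
```

With these assumed intensities, only the stable feature survives the 30% cut-off.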
Student's t-tests were used to detect differences in metabolite
concentrations between the two phenotypes. P values were adjusted
for multiple testing using the Benjamini–Hochberg false discovery rate
(FDR) procedure, with an FDR-adjusted P ≤0.05 as the threshold.
Supervised partial least squares discriminant analysis (PLS-DA) was
conducted with metaX to identify the variables that discriminate
between groups. The variable importance in projection (VIP) value was
calculated, and a VIP cut-off value of 1.0 was used to select important
features.
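
The Benjamini–Hochberg adjustment applied to the t-test P values can be sketched as below. This is a minimal stdlib-only illustration with hypothetical P values; the function name is an assumption, and the study itself would have relied on R routines.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    n = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four features (illustration only).
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)

# Features passing the FDR-adjusted P <= 0.05 threshold.
significant = [i for i, p in enumerate(adj_p) if p <= 0.05]
print(adj_p, significant)
```

In this example only the first feature remains significant after adjustment, which shows how a raw P value below 0.05 (here 0.03 and 0.04) can fail the FDR threshold.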