3.3 Single Cell Metabolomics
Single cell metabolomics (SCM) is designed to evaluate the metabolic profiles of cells with single cell resolution. Like many other single cell omics approaches, SCM provides a means of assessing heterogeneity between single cells.58 The majority of SCM pipelines rely on mass spectrometry (MS)-based methods to quantify metabolomic profiles. As with all single cell methodologies, the first step is to isolate single cells. This can be accomplished while keeping the morphology intact using techniques such as FACS or microfluidic arrays, or by atomic force microscopy (AFM), which isolates the single cell’s metabolites in a probe. The next step is to quench all metabolic activity within the single cells. This is accomplished either with organic acids or solvents, which denature enzymes and impede further conversion of metabolites, or by snap freezing in liquid nitrogen, which halts metabolic activity and promotes membrane lysis. Snap freezing limits the use of reagents that could act as contaminants in the downstream MS analysis but requires further workup to obtain pure metabolic profiles. Organic solvents are beneficial in the quenching phase because they also aid in metabolite extraction from the lysate. Different combinations of organic solvents are used depending on the type of MS. Guo et al. provide an excellent review on the appropriate use of solvent mixtures for each type of MS. After metabolic activity is quenched and the metabolites are extracted, samples are ionized. Ionization methods fall into two distinct categories: vacuum-based and ambient. The review by Liu et al. provides an excellent summary of these variations, including applications for each technique and data preprocessing and analysis strategies for the resulting mass spectra.59
M1 macrophages undergo metabolic reprogramming from oxidative phosphorylation to glycolysis upon phenotypical differentiation.60 Sustained M1 macrophage activation leads to sustained inflammation and can cause tissue damage if not appropriately regulated. The M2 phenotype is acquired after polarization with IL-4 and macrophage colony stimulating factor (M-CSF). M2 macrophages regulate the inflammatory response by secreting anti-inflammatory cytokines such as IL-10 and TGFβ. This gives them an important role in tissue repair after the host inflammatory response. Unlike the M1 phenotype, M2 macrophages rely upon the tricarboxylic acid (TCA) cycle to support their production of ATP through oxidative phosphorylation. Phenotypical assays to assess macrophage population heterogeneity rely upon detection of cytokines or membrane bound surface markers. However, cytokines and surface markers are expressed in both phenotypes, which creates the need for a more specific phenotypical classification assay. To reliably quantify phenotypical differences, metabolic profiling with an MS-based approach can be used, specifically time of flight secondary ion mass spectrometry (TOF-SIMS) coupled with an Orbitrap analyzer (3D OrbiSIMS), which, in contrast to ordinary SIMS or TOF-SIMS, allows for MS/MS of the metabolite. TOF-SIMS provides highly localized spatial resolution of cell surfaces and does not require extensive sample preparation compared to most LC-MS techniques. Historically, the TOF-SIMS approach has not been reliably used to document endogenous metabolic profiles because of its poor mass resolving power. However, when it is coupled with the high mass resolving power and mass accuracy of an Orbitrap, characterizing metabolic profiles at the single cell level becomes possible.
Using a targeted approach to analyze the lipid palette (matching lipid ion peaks to the LIPID MAPS database), it was found that M1 macrophages had the highest lipid counts and a different lipid composition compared to M2 or M0.38 The amino acid composition and other metabolites showed notable differences as well. Overall, the study represents a novel approach for in situ characterization of metabolic profiles to assess phenotypical differences between closely related cell types with single cell resolution.
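Matching observed ion peaks against a reference database is typically done within a mass tolerance expressed in parts per million (ppm). The following is a minimal Python sketch of that idea, not the procedure used in the cited study. The reference names, m/z values, and the 2 ppm tolerance are hypothetical placeholders chosen for illustration and are not LIPID MAPS entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceIon:
    name: str   # placeholder label, not a real database entry
    mz: float   # theoretical m/z of the ion


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_peaks(observed, references, tol_ppm=2.0):
    """Return (observed_mz, reference_name, ppm_error) for peaks within tolerance.

    Each observed peak is compared with its closest reference ion, and the
    assignment is kept only if the absolute error is at most tol_ppm.
    """
    matches = []
    for mz in observed:
        best = min(references, key=lambda r: abs(ppm_error(mz, r.mz)))
        err = ppm_error(mz, best.mz)
        if abs(err) <= tol_ppm:
            matches.append((mz, best.name, err))
    return matches


# Hypothetical reference list and peak list, for illustration only.
REFERENCES = [
    ReferenceIon("lipid_A", 700.0000),
    ReferenceIon("lipid_B", 750.0000),
    ReferenceIon("lipid_C", 800.0000),
]
OBSERVED = [700.0007, 750.0100, 799.9992, 620.0000]

hits = match_peaks(OBSERVED, REFERENCES, tol_ppm=2.0)
for mz, name, err in hits:
    print(f"{mz:.4f} -> {name} ({err:+.2f} ppm)")
```

In this toy input, the peak at 750.0100 is rejected because it lies roughly 13 ppm from the nearest reference. A narrow ppm tolerance is only meaningful when the analyzer has high mass accuracy, which is the property that the Orbitrap contributes in the approach described above.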
HIGH THROUGHPUT IMAGING:
High throughput imaging (HTI) represents a robust set of methodologies that provide information on cellular morphology and include large scale automated sample preparation and image analysis.61 All HTI workflows are based upon a targeted approach, in which a dye, fluorescent reagent, fluorophore-conjugated antibody or oligonucleotide, or genetic construct expressing a fluorescent protein is used to label a particular component of the cell. This can include proteins, nucleic acids, or specific organelles within the cell. The workflow begins with perturbing cells from steady state. This is done either by adding ligands that activate cellular signaling pathways and thereby induce differential gene expression, or by using short hairpin RNA (shRNA) or clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9, which respectively knock down and knock out genes in a targeted manner. Either route gives scientists the ability to determine the role of the labeled target in response to cellular perturbation. The ligand used to disrupt cellular steady state dictates the cellular pathways involved in regulating the change in phenotype for the target of interest.
Traditionally, drug discovery methodologies are target based: they rely on prior knowledge of a target molecule (an interacting protein or a specific receptor) and assess the structural dynamics of “hit/lead” compounds that can bind the target and modulate its effector function.62 This approach typically involves screening the target against a library of compounds predicted to have high binding affinity based on knowledge of the binding pocket and/or molecular docking simulations, and then assessing the mechanism of action for each compound in downstream experiments.63 Structure-based drug discovery is developing rapidly as artificial intelligence and deep learning algorithms progress.64 It should be noted that target-based screening methodologies carry an inherent bias, in that hit compounds designed for an individual target may have off-target effects (polypharmacology). To address this bias, phenotypic drug discovery (PDD) methodologies start from a cellular phenotype and attempt to identify hit compounds that modulate the phenotypic readout. Lin et al. provide a review that covers workflow design and data analysis for each of these screening methodologies in greater detail.
HTI can also be used to profile compounds or libraries in a multiparameter fashion based on statistical analysis and clustering of compounds that elicit a similar phenotypic readout.65 For example, a multiplexed image-based assay termed “Cell Painting” can measure ~1,500 morphological features pertaining to different combinations of size, shape, texture, staining intensity, and so on, in response to multiple perturbations.66 These perturbations can be chemically induced or can result from genetic manipulation (knockdown/knockout). The technique is capable of sensing subtle phenotypical differences, and it groups perturbing agents (compounds/genes) according to the pathways they affect, thus shedding light on the markers of disease. The basic workflow involves genetic or chemical perturbation of a target cell line, staining cells with fluorescent dyes that label 8 different compartments within the cell (nucleus, F-actin, plasma membrane, mitochondria, etc.), microscopy imaging, and image analysis, in which the morphological features extracted from the images are assembled into a profile that reflects the phenotypic state of the cell. Comparing profiles taken at different time points or with different perturbing agents can elucidate the mechanism of action and cellular signaling pathways pertaining to a particular phenotype.
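The idea of grouping perturbing agents by the similarity of their morphological profiles can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions and not the published Cell Painting analysis pipeline. Each perturbation is summarized by a short, hypothetical feature vector (real profiles contain on the order of 1,500 features), features are standardized across perturbations, and similarity is measured with the Pearson correlation. All names and values are made up for illustration.

```python
from statistics import mean, pstdev


def zscore_columns(profiles):
    """Standardize each feature (column) across all perturbations."""
    names = list(profiles)
    n_features = len(profiles[names[0]])
    scaled = {name: [] for name in names}
    for j in range(n_features):
        column = [profiles[name][j] for name in names]
        mu, sigma = mean(column), pstdev(column)
        for name in names:
            # A constant feature carries no information; map it to 0.
            value = (profiles[name][j] - mu) / sigma if sigma else 0.0
            scaled[name].append(value)
    return scaled


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def most_similar(profiles, query):
    """Return the perturbation whose profile correlates best with the query."""
    scaled = zscore_columns(profiles)
    scores = {
        name: pearson(scaled[query], vec)
        for name, vec in scaled.items()
        if name != query
    }
    return max(scores, key=scores.get), scores


# Hypothetical per-perturbation mean feature values (e.g. area, texture, ...).
PROFILES = {
    "compound_A": [1.0, 2.0, 3.0, 4.0],
    "compound_B": [1.1, 2.1, 2.9, 4.2],
    "compound_C": [4.0, 3.0, 2.0, 1.0],
    "dmso_control": [2.5, 2.5, 2.5, 2.5],
}

best, scores = most_similar(PROFILES, "compound_A")
print(best, {name: round(score, 2) for name, score in scores.items()})
```

Here compound_A and compound_B shift the features in the same direction and are therefore grouped together, whereas compound_C produces an opposite profile. In practice the pairwise similarities would feed a clustering step over many perturbations rather than a single nearest-neighbor lookup.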
Lastly, HTI methodologies can be classified into a third category, used for deep imaging which combines the use of fully automated high-resolution microscopy and sophisticated computational analysis of many images.67 Due to the massive amount of cellular input, this technique can assess rare cellular phenotypes that may only be present at low probabilities.68 This niche application of HTI is relatively unexplored, perhaps due to the limited availability of cost prohibitive high throughput imaging systems. However, even with the limited number of studies utilizing the deep imaging approach, the potential of HTI to identify rare cellular phenotypes in response to a small subset of perturbing agents represents an important area of study in systems biology, that is, phenotypical profiling of biomarkers of disease pathogenesis for rare and understudied host-pathogen interactions.
Nuclear factor kappa-light-chain enhancer of activated B cells (NF-κB) is a family of transcription factors that regulate innate and adaptive immune responses, cellular differentiation, proliferation, and apoptosis.69 The mammalian NF-κB family is dimeric in nature and includes five different protein monomers (p65/RelA, RelB, cRel, p50/105, and p52/100) that form either homodimers or heterodimers, and each dimer binds DNA differentially. All the monomers have a conserved N-terminal domain known as the Rel homology domain (RHD), which is essential for DNA binding, dimerization, nuclear localization, and inhibitor binding.70 The NF-κB p105 and p100 proteins contain an IκB inhibitory domain, which contains multiple copies of the ankyrin repeat (ANK) at the C-terminus. Both NF-κB subfamily proteins (p105 and p100) undergo proteasome-dependent partial proteolysis to their active DNA binding forms (p50 and p52, respectively).71 NF-κB dimers reside in the cytoplasm bound to inhibitor proteins of the IκB family. Upon degradation of the inhibitor (phosphorylation of IκB by IκB kinase (IKK), followed by ubiquitylation and proteasomal degradation), NF-κB translocates into the nucleus, where it binds DNA and stimulates transcription of target genes. Importantly, one of the target genes is the inhibitor itself, which provides a mode of NF-κB signaling regulation via a negative feedback loop.
With constant stimulation, the repeated degradation of the inhibitor and its NF-κB-driven re-synthesis lead to oscillations of NF-κB nuclear translocation.72 Oscillations in NF-κB translocation are signal dependent. For example, sustained stimulation of cells with tumor necrosis factor α (TNFα) leads to oscillations, whereas a short one-time stimulation with TNFα leads to only one sharp peak of NF-κB translocation/activation.73 Induction of cells with LPS leads to an entirely different response, in which NF-κB translocation has been found to follow one of three patterns: a single cycle of translocation, persistent translocation, or oscillations.74 Since the translocation of NF-κB is stimulus-dependent, it is important to consider the kinetic and spatio-temporal landscape of NF-κB when studying the pertinent signaling dynamics. These studies highlight the importance of high throughput imaging techniques for studying cellular signaling dynamics with real time quantification and showcase how HTI workflows can be utilized to elucidate the mechanisms underlying host-pathogen interactions.
COMPUTATIONAL SIMULATION AND MODELING:
Computational biology involves the assessment of complex biological systems through the development of computational models and simulations, which can be used to build predictive models of the factors involved in disease pathogenesis. The field is progressing rapidly with developments in computer hardware, software, and experimental methods, which lower the computational effort required to produce these models. Computational models can be separated into two subgroups: quantitative and logical models. A quantitative model uses sets of differential equations, which are typically non-linear, to define the dynamics of the model. It requires prior knowledge of the details of the pathway or cellular event under study and is thus limited to modeling small portions of a well-characterized pathway. A logical model is based upon a Boolean system and defines the dynamics of the model qualitatively. It does not require detailed prior knowledge of the system to be analyzed and can thus be applied to large scale systems.
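To make the notion of a logical model concrete, the sketch below encodes the NF-κB/IκB negative feedback loop described earlier as a three-node Boolean network with synchronous updates. It is a toy illustration, not a published or validated model: the node names and update rules are assumptions chosen only to capture the qualitative logic (IKK removes the inhibitor, free NF-κB enters the nucleus, and nuclear NF-κB induces re-synthesis of its inhibitor).

```python
def step(state, stimulus):
    """One synchronous update: every node is computed from the previous state."""
    return {
        # The kinase follows the external stimulus.
        "IKK": stimulus,
        # The inhibitor is present if IKK is inactive, or if nuclear NF-kB
        # has driven its re-synthesis.
        "IkB": (not state["IKK"]) or state["NFkB_nuclear"],
        # NF-kB is nuclear only when the inhibitor is absent.
        "NFkB_nuclear": not state["IkB"],
    }


def simulate(stimulus_schedule):
    """Return the nuclear NF-kB state after each step of the schedule."""
    # Resting cell: no kinase activity, inhibitor present, NF-kB cytoplasmic.
    state = {"IKK": False, "IkB": True, "NFkB_nuclear": False}
    trajectory = []
    for stimulus in stimulus_schedule:
        state = step(state, stimulus)
        trajectory.append(state["NFkB_nuclear"])
    return trajectory


sustained = simulate([True] * 8)            # continuous stimulation
pulse = simulate([True] + [False] * 7)      # single short stimulation

print("sustained:", "".join("1" if on else "0" for on in sustained))
print("pulse:    ", "".join("1" if on else "0" for on in pulse))
```

Under these assumed rules, sustained stimulation produces repeated on/off cycles of nuclear NF-κB, whereas a single pulse produces one transient peak before the network returns to rest. The model has no rate constants, which illustrates why logical models scale to large networks but cannot predict timing or amplitude. A quantitative model of the same loop would replace each rule with a differential equation and require measured kinetic parameters.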
A subfield of computational biology known as systems bioinformatics utilizes both modeling approaches and resides at the intersection of systems biology and traditional bioinformatics.75 The field of systems bioinformatics can be defined as the framework for integrating the multiomics landscape traditionally used in systems biology approaches, to provide insight into each individual omics layer and the cumulative interactions between them. In this way, systems bioinformatics provides methodologies capable of assessing the biological mechanisms of the entire interwoven system rather than the summation of each individual component or omics layer. The field is grounded in systems theory, which is holistic in nature, and its use in systems bioinformatics depends on graph theory, network science, and other mathematical approaches that facilitate the analysis of complex networks derived from the system of interest.
Mathematical models are fundamental for analyzing network topology and kinetics. As multiomics based quantification methods advance in both throughput and sensitivity, more accurate parameters can be fed into models, yielding more accurate quantitative simulations of signaling dynamics. Networks can also be applied to qualitative pathway models, and many open access platforms are available for that purpose. These tools take an input list of gene symbols or protein names and map them using their gene ontology (GO) annotations. For example, Cytoscape facilitates the visualization of complex biological networks with annotated gene symbols and expression data.76 Reactome enhanced pathway visualization is another peer reviewed alternative.77 There are several pathway databases which allow for the visualization of signaling components based upon GO terms. A review comparing some of the most widely used databases is available,78 as is a detailed review on the construction and analysis of biological pathways.79
Construction of networks to showcase biological pathways can incorporate both mathematical modeling functions and qualitative visualization to help researchers fully understand the molecular dynamics involved in cell signaling. One such software package is Simmune, which generates computational models incorporating spatially resolved reaction-diffusion networks. Simmune utilizes rule-based approaches to lower the computational complexity of simulations. This involves pre-programming the simulation with fundamental signaling components (important proteins) and their pair-wise interactions, which allows the computer to assemble the complexes that constitute the signaling network.80 This rule-based approach was introduced in response to one of the longest-standing challenges in pathway modeling: combinatorial explosion. Combinatorial explosion can occur when an excessively high number of alternative interactions arises from a network consisting of many different signaling components, or from individual components that have multiple binding sites and thus many possible interactions. Simmune works by generating a local network in a multi-step fashion. The first step involves the construction of a non-spatial network that includes every possible molecular interaction for each fundamental signaling component. This ‘template network’ is then adjusted to reflect the local molecular environment, which lowers the computational cost of the simulation. Simmune is also able to account for morphologically dynamic models, which usually require rebuilding the network every time the cellular morphology changes to account for spatial constraints on receptor-ligand interactions during membrane fluctuation. In silico approaches for assessing pathway dynamics represent an important stepping stone in systems bioinformatics and all related disciplines to either validate or predict experimental findings.
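The scale of combinatorial explosion can be illustrated with a simple count. The sketch below is a deliberately simplified illustration and not Simmune's algorithm. It assumes a single scaffold protein whose binding sites are independent, with each site either empty or occupied by one of its possible partners. The number of distinct complexes is then the product of (partners + 1) over all sites, while a rule-based description only needs one pairwise rule per site-partner combination.

```python
from math import prod


def count_complexes(partners_per_site):
    """Distinct occupancy states of a scaffold with independent binding sites.

    Each site can be empty or bound by one of its k possible partners,
    giving (k + 1) options per site.
    """
    return prod(k + 1 for k in partners_per_site)


def count_pairwise_rules(partners_per_site):
    """Number of pairwise binding rules needed to describe the same system."""
    return sum(partners_per_site)


# Hypothetical scaffolds with 2 possible partners at every site.
for n_sites in (2, 5, 10):
    sites = [2] * n_sites
    print(
        f"{n_sites:2d} sites: "
        f"{count_complexes(sites):6d} complexes vs "
        f"{count_pairwise_rules(sites):2d} rules"
    )
```

With ten such sites, the explicit network would contain 59,049 distinct complexes, whereas the rule set grows only linearly to 20 entries. This gap is the motivation for specifying pairwise interactions and letting the software assemble only the complexes that are relevant in a given local environment.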
In a study by Manes et al., the sphingosine-1-phosphate (S1P) chemosensing pathway was explored with a combined approach of RNA sequencing, targeted proteomics, and Simmune-based modeling for computational validation.81,82 This highlights the importance of in silico computational models in providing insights into the molecular mechanics of complex signaling networks and in validating experimental findings.
Protein structure prediction tools such as AlphaFold2 are also informative for predicting and assessing protein structure and can be used to infer a protein’s effector function.83 AlphaFold2 can construct a 3-dimensional representation of how a protein will fold, based upon its primary sequence. The software uses a deep learning-based algorithm with multiple sequence alignment, which incorporates both physical and biological knowledge regarding protein structure.84 In addition to the software’s abilities, the AlphaFold team has now released accurate structure predictions for the human proteome in a freely available database.85 The availability of a continuously updated database of highly accurate protein structures is a major step forward for the field of structural biology, as it removes the burden of generating these structural models from scratch.86 As AlphaFold2 evolves along with the database, we can expect to see more publicly available structural predictions that provide researchers with tools for drug discovery, for investigating heteromeric protein-protein interactions, and for creating simulations of pathway dynamics in which each component of the signaling pathway includes a highly accurate 3-dimensional model of its native conformation.87 This will greatly benefit the field of systems biology, as better structural predictions of the human proteome can help researchers assess the possible functions of a protein and build more complex models of its kinetics.
INTEGRATING OMICS APPROACHES: