6.1 Integrating MSM and ML to elucidate the emergence of function in complex systems
We are in the midst of a paradigm shift in the development of MSM methods, driven by rapid changes in HPC infrastructure (see Figure 2) and by advances in ML methods. MSM and HPC have emerged as essential tools for modeling complex problems at the microscopic scales, with a focus on leveraging the structured and embedded physical laws to gain a mechanism-based understanding. This success notwithstanding, designing new MSM algorithms for coupling different scales and utilizing data, and implementing them on HPC platforms, is becoming increasingly cumbersome in the face of heterogeneous data availability and rapidly evolving HPC architectures and platforms. On the other hand, purely data-driven models of molecular and cellular systems spawned by the techniques of data science [132-134], and in particular ML methods including deep learning [135-137], are easy to train and implement, but the underlying model manifests as a black box. This general approach taken by the ML community is well suited for classification, learning, and regression problems, but suffers from limitations in interpretability and explainability, especially when mechanism-based understanding is a primary goal. There lies a vast potential in combining MSM, HPC, and ML methods with their complementary strengths [4]. MSM models are routinely coupled together by appropriately propagating information across scales (see section 5), while the ever-increasing advances in hardware capabilities and high-performance software implementations allow us to study increasingly complex phenomena at higher fidelity and higher resolution. While much of the discussion thus far has focused on MSM and HPC methods, the integration of MSM and ML represents the forefront of emerging MSM research; below, we discuss the progress and potential of a few emerging integrative approaches that combine ML and MSM.