Introduction
GridMol is designed as a grid application for molecular modeling and visualization.1 It comprises a typical browser and server platform based on Java and Java3D, which have “write once and run anywhere” capability.2,3 Users’ computing tasks can be submitted to high-performance computers (HPCs) as jobs via the China National Grid (CNGrid)4,5 or the Scientific Computing Grid (ScGrid),6 each of which provides unified high-performance computing services based on multiple heterogeneous HPCs. CNGrid has aggregated more than 460 petaflops of computing capability and 310 million gigabytes of storage from 19 supercomputing sites in China, including Sunway TaihuLight (No. 3 in the 2019 HPC TOP500 list) and Tianhe-2A (No. 5 in the 2019 HPC TOP500 list).7 ScGrid has aggregated more than 215 petaflops of computing capability from 10 institutes of the Chinese Academy of Sciences (CAS). Both CNGrid and ScGrid expose their services through RESTful APIs, including unified user authentication and authorization, job submission and status checking, and secure user data management. Based on the grid environment, GridMol focuses on providing “one-stop” computational chemistry solutions, including molecular modeling, visualization, animation, job submission, management, and results analysis.1,8
The main features of GridMol version 1.0 include molecular modeling, visualization, and job submission.1,9 GridMol supports all basic molecular modeling functions, including modification of bond lengths, angles, and dihedral angles, as well as addition or modification of atoms and radicals. Further, users can use GridMol to read files in common chemistry software formats, such as PDB, MOL2, GJF, and XYZ, and visualize three-dimensional (3D) structures in different display formats, including tube, ribbon, and cartoon models for large molecules and other formats, such as line, ball-and-stick, and space-filling models, for molecules.3 GridMol makes it easy for users (those without an account can register for free through http://www.cngrid.org/) to launch a job on remote HPCs. While the job runs, GridMol monitors its status and allows users to terminate the job via the RESTful APIs provided by CNGrid or ScGrid. After several years of development, GridMol now represents a powerful research tool.
GridMol version 2.0 introduces new and unique implementations, the most important of which are fragment-based linear scaling quantum chemistry methods (FLSMs) and grid-based interactive visualization. FLSMs offer a unique solution for calculations involving large molecules by decomposing a large molecular system into subsystems that can be calculated at quantum mechanical (QM) levels and by representing the properties of an otherwise intractable supersystem as a reassembly of the properties of individual fragments.10-13 Among the many FLSMs, the molecular fractionation with conjugate caps (MFCC)14 and fragment molecular orbital (FMO)15 methods (both described later) are selected for implementation in GridMol. Their implementation is important for generalized automation of fragment-based calculations for the following reasons:
1) it addresses computational limitations associated with performing electrostatic calculations on large molecules by providing an accurate, automated fragmentation step in place of manual fragmentation, which is cumbersome and inaccurate;
2) it increases the ease of input-file preparation for QM procedures by enabling users to run FLSMs even without expert knowledge; and
3) it allows users to conveniently utilize HPCs to speed up the computational process, especially given that substructures can be treated separately through parallel calculations; thus, the total computational time for all fragments is approximately equivalent to that required for the largest fragment.
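The timing argument in point 3 can be illustrated with a minimal sketch. It is not GridMol code: the function name estimate_wall_time, the greedy longest-first scheduling, and the per-fragment costs are assumptions made for illustration only. When there are at least as many workers as fragments, the estimated wall time reduces to the cost of the largest fragment.

```python
import heapq


def estimate_wall_time(fragment_costs, n_workers):
    """Estimate wall time for independent fragment jobs (hypothetical helper).

    Fragments are assigned longest-first to whichever worker is currently
    least loaded; the result is the load of the busiest worker.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # One running total per worker; never more workers than fragments.
    loads = [0.0] * min(n_workers, max(len(fragment_costs), 1))
    heapq.heapify(loads)
    for cost in sorted(fragment_costs, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + cost)
    return max(loads)


# Hypothetical per-fragment QM times in hours (not measured data).
costs = [4.0, 9.0, 2.5, 6.0]
serial_time = estimate_wall_time(costs, 1)             # sum of all fragments
limited_time = estimate_wall_time(costs, 2)            # two workers share the load
parallel_time = estimate_wall_time(costs, len(costs))  # cost of largest fragment
```

With fewer workers than fragments, the estimate lies between the serial sum and the largest-fragment cost, which is why access to many HPC nodes matters for fragment-based methods.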
Another important feature is grid-based interactive visualization. When performing molecular calculations that involve multiple self-consistent steps, such as structure optimization (OPT) and intrinsic reaction coordinate (IRC) calculations, users usually need to check intermediate results in order to examine structure and/or energy evolution, especially when establishing new systems.16-18 Monitoring the calculation in real time allows users to identify problems without wasting computing time or resources. Although graphical tools exist for analyzing the results calculated by chemistry software (e.g., GaussView for Gaussian, VMD for NAMD, MS Visualizer for Materials Studio),19-21 these are desktop applications that cannot connect to multiple remote HPCs via simple APIs and instead rely on open ports. Although GridChem can launch and monitor calculations on supercomputers from remote sites, it is still a desktop application.22 GridMol is a web-based application that can securely launch jobs on multiple remote HPCs via the RESTful APIs. Additionally, it is difficult to visualize results on HPCs directly, so process monitoring is necessary for visualizing intermediate computational results. Owing to its cross-platform properties and the simple APIs of the CNGrid and ScGrid environments, process monitoring is convenient across different remote HPCs, incurs minimal lag time, and enables identification of simulation problems in time to either terminate or restart a job.
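The monitoring workflow described above can be sketched as a simple polling loop. This is an illustrative sketch only, not the GridMol implementation or the CNGrid/ScGrid API: the callables fetch_status, fetch_energies, and terminate stand in for whatever RESTful calls the grid provides, and the status labels and the energy_rising check are hypothetical.

```python
import time

# Hypothetical status labels; a real grid service may use different ones.
TERMINAL_STATES = {"FINISHED", "FAILED", "TERMINATED"}


def energy_rising(energies, window=3):
    """Return True if the energy increased in each of the last `window` steps.

    A deliberately crude illustrative check, not a criterion from the source.
    """
    tail = energies[-(window + 1):]
    return len(tail) == window + 1 and all(b > a for a, b in zip(tail, tail[1:]))


def monitor_job(fetch_status, fetch_energies, terminate, looks_wrong,
                poll_seconds=30.0, max_polls=1000, sleep=time.sleep):
    """Poll a remote job and terminate it early if intermediate results look wrong.

    All four callables are supplied by the caller (assumed wrappers around
    the grid's job-status, data-retrieval, and job-termination services).
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if looks_wrong(fetch_energies()):
            terminate()  # stop the job before it wastes more resources
            return "TERMINATED"
        sleep(poll_seconds)
    return "TIMED_OUT"
```

Passing the remote calls in as functions keeps the loop independent of any particular grid service, so the same logic could in principle be reused for different HPC back ends.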
In addition, other minor updates have been introduced, including interfaces for input-file preparation for chemistry software and extensive result analysis. In this paper, we provide two practical examples to illustrate FLSM applications in GridMol. For MFCC, we assessed the performance of different density functional theory (DFT) methods for calculating ligand–protein binding energies. Furthermore, we used FMO, combined with transition-state (TS) calculations, to provide a new solution for studying the dissociation mechanism related to ligand-targeted drugs (LTDs).23