Introduction
GridMol is a grid application for molecular modeling and
visualization.1 It comprises a typical browser-and-server platform based
on Java and Java3D, which offer “write once, run anywhere”
capability.2,3 Users’ computing tasks can be submitted as jobs to
high-performance computers (HPCs) via the China National Grid
(CNGrid)4,5 or the Scientific Computing Grid (ScGrid),6 both of which
provide unified high-performance computing services based on multiple
heterogeneous HPCs. CNGrid has
aggregated more than 460 petaflops of computing capability and 310
million gigabytes of storage from 19 supercomputing sites in China,
including Sunway TaihuLight (No. 3 on the 2019 HPC TOP500 list) and
Tianhe-2A (No. 5 on the 2019 HPC TOP500 list).7 ScGrid has aggregated
more than 215 petaflops of computing capability
from 10 institutes of the Chinese Academy of Sciences (CAS). Both
CNGrid and ScGrid expose their services through RESTful APIs, including
unified user authentication and authorization, job submission and
status checking, and secure user data management. Based on the grid
environment, GridMol focuses on providing “one-stop”
computational chemistry solutions, including molecular modeling,
visualization, animation, job submission, management, and results
analysis.1,8
The main features of GridMol version 1.0 include molecular modeling,
visualization, and job submission.1,9 GridMol supports all basic
molecular modeling functions, including modification of bond lengths,
angles, and dihedral angles, as well as addition or modification of
atoms and radicals. Users can also read files in common chemistry
software formats, such as PDB, MOL2, GJF, and XYZ, and visualize
three-dimensional (3D) structures in different display styles: tube,
ribbon, and cartoon models for large molecules, and line,
ball-and-stick, and space-filling models for small molecules.3 GridMol
also makes it easy for users to launch jobs on remote HPCs (those
without an account can register for free at http://www.cngrid.org/).
While a job
runs, GridMol monitors its status and provides users with the ability to
terminate the job via the RESTful APIs provided by CNGrid or ScGrid.
After several years of development, GridMol now represents a powerful
research tool.
GridMol version 2.0 introduces new and unique implementations, the most
important of which are fragment-based linear scaling quantum chemistry
methods (FLSMs) and grid-based interactive visualization. FLSMs offer a
unique solution for calculations involving large molecules by
decomposing a large molecular system into subsystems that can be
calculated at quantum mechanical (QM) levels, as well as representing
properties of intractable super systems as a reassembly of individual
fragments.10-13 Among many FLSMs, molecular
fractionation with conjugate caps (MFCC)14 and
fragment molecular orbital (FMO)15 methods (both
described later) were selected for implementation in GridMol. This
automated implementation of fragment-based calculations is important for
the following reasons:
1) it addresses computational limitations associated with performing
electrostatic calculations on large molecules by providing an accurate,
automated fragmentation step, a task that is cumbersome and inaccurate
when done manually;
2) it simplifies input-file preparation for QM procedures, enabling
users to perform FLSM calculations even without expert knowledge; and
3) it allows users to conveniently utilize HPCs to speed up the
computational process: because substructures can be treated separately
through parallel calculations, the total computational time for all
fragments is approximately equal to that required for the largest
fragment.
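As a rough illustration of point 3, the sketch below compares the serial and the ideally parallel wall-clock times for a set of fragment calculations. The fragment names and timings are hypothetical values chosen for illustration, and the model assumes one worker per fragment with no queueing or communication overhead.

```python
def serial_time(fragment_times):
    """Total time when fragments are computed one after another."""
    return sum(fragment_times.values())


def parallel_time(fragment_times):
    """Wall-clock time with one worker per fragment (idealized)."""
    return max(fragment_times.values())


# Hypothetical per-fragment QM timings in minutes (not measured data).
fragment_times = {"frag_1": 12.0, "frag_2": 45.0, "frag_3": 30.0}

print("serial:", serial_time(fragment_times))      # serial: 87.0
print("parallel:", parallel_time(fragment_times))  # parallel: 45.0
```

In practice, job queueing and the final reassembly step add overhead, so the largest-fragment time is best read as a lower bound.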
Another important feature is grid-based interactive visualization. When
performing molecular calculations, such as structure optimization (OPT)
and intrinsic reaction coordinate (IRC) calculation, which involve
multiple self-consistent steps, users usually need to check intermediate
results in order to examine structure and/or energy evolution,
especially when establishing new systems.16-18 Monitoring the
calculation in real time allows users to identify problems early and
avoid wasting computing time or resources. Although
graphical tools exist for analyzing the results calculated by chemistry
software (e.g., GaussView for Gaussian, VMD for NAMD, MS Visualizer for
Materials Studio),19-21 these are desktop applications that cannot
connect to multiple remote HPCs via simple APIs and instead rely on
open ports. Although GridChem can launch and monitor calculations on
supercomputers from remote sites, it is still a desktop
application.22 By contrast, GridMol is a web-based application that
can securely launch jobs on multiple remote HPCs via the RESTful
APIs. Additionally, it is difficult to visualize results on HPCs
directly, so process monitoring is necessary for visualizing
intermediate computational results. Owing to the cross-platform
properties and the simple APIs associated with the CNGrid and ScGrid
environments, process monitoring works conveniently across different
remote HPCs with minimal lag and enables identification of simulation
problems in time to either terminate or restart a job.
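The monitoring workflow described above can be sketched as a simple polling loop. This is a minimal illustration rather than GridMol's implementation: `fetch_status` and `terminate` are hypothetical callables standing in for the grid's job-status and job-termination requests, and the stopping rule (energy rising for several consecutive polls) is an assumed heuristic.

```python
import time


def monitor_job(fetch_status, terminate, max_rising_steps=3, poll_seconds=0.0):
    """Poll a running job and stop it early if the energy keeps rising.

    fetch_status: hypothetical callable returning a dict such as
        {"state": "RUNNING", "energy": -76.4}; each poll is assumed
        to report a new optimization step.
    terminate: hypothetical callable that cancels the job.
    Returns the last reported state, or "TERMINATED" if cancelled here.
    """
    previous_energy = None
    rising = 0
    while True:
        status = fetch_status()
        if status["state"] != "RUNNING":
            return status["state"]
        energy = status["energy"]
        # Count consecutive polls in which the energy increased.
        if previous_energy is not None and energy > previous_energy:
            rising += 1
        else:
            rising = 0
        previous_energy = energy
        if rising >= max_rising_steps:
            terminate()
            return "TERMINATED"
        time.sleep(poll_seconds)


# Simulated status stream (made-up energies) in place of real API calls.
fake_statuses = iter([
    {"state": "RUNNING", "energy": -10.0},
    {"state": "RUNNING", "energy": -9.0},
    {"state": "RUNNING", "energy": -8.0},
    {"state": "RUNNING", "energy": -7.0},
])
cancelled = []
result = monitor_job(lambda: next(fake_statuses), lambda: cancelled.append(True))
print(result)  # TERMINATED
```

Passing the status and termination calls in as functions keeps the loop independent of any particular grid API, so the same logic can be exercised with simulated data as shown.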
Moreover, other minor updates, including interfaces for input-file
preparation for chemistry software and extensive result analysis, have
been introduced. In this paper, we provide two practical examples in
order to illustrate FLSM applications in GridMol. For MFCC, we assessed
the performance of different density functional theory (DFT) methods for
calculating ligand–protein binding energies. Furthermore, we used FMO,
combined with transition-state (TS) calculation, to provide a new
solution for studying the dissociation mechanism related to
ligand-targeted drugs (LTDs).23