Table 4: Resources for HPC training and community
In Table 2, we mentioned the development of parallel algorithms, which led to a transformation in multiphysics simulations. Examples include parallel matrix operations and linear algebra (https://cvw.cac.cornell.edu/APC/); parallel implementations of the N-body problem with short-range interactions (https://cvw.cac.cornell.edu/APC/); long-range interactions and the parallel particle mesh Ewald sum [49]; parallel Monte Carlo [50]; linear-scaling methods such as multipole expansion [49]; linear-scaling density functional theory [51]; and parallel graph algorithms (https://cvw.cac.cornell.edu/APC/).

As a specific example, we note that the N-body problem is an essential ingredient in MD. A common goal in MD of large systems is to perform sufficient sampling of the combinatorially large number of conformations available to even the simplest biomolecules [52, 53]. In this respect, a potential disadvantage of molecular dynamics calculations is the inherent limitation on the maximum time step of the simulation (≤ 2 fs); at a 2 fs time step, a 10 μs trajectory already requires 5 × 10^9 integration steps. Solvated biomolecular systems typically consist of 10^5–10^6 atoms. For such system sizes, with current hardware and software, reaching simulation times in the tens-of-microseconds regime is an exceedingly labor-intensive and challenging endeavor that requires a combination of algorithmic enhancements and high-performance computing hardware. For example, cutoff distances reduce the number of interactions to be computed without loss of accuracy for short-range interactions, but not for long-range (electrostatic) interactions; long-range corrections such as the particle mesh Ewald algorithm [54], together with periodic boundary conditions, are typically employed to maintain accuracy (see the sketches following this section).

Parallelization techniques enable the execution of such simulations on supercomputing resources, for example across 4096 processors of a networked Linux cluster. Although a cluster of this size is a large investment, it is accessible to academic researchers through the Extreme Science and Engineering Discovery Environment (XSEDE). XSEDE resources (www.xsede.org) currently include petaflop-scale computing capability, and US national laboratories such as Oak Ridge are moving towards exascale computing (https://www.exascaleproject.org) [55]. Another approach, capitalizing on advances in hardware architecture, is to build custom hardware for MD simulations, which offers one to two orders of magnitude enhancement in performance; examples include MDGRAPE-3 [56, 57] and ANTON [58, 59]. Graphics processing unit (GPU)-accelerated computation has recently come to the forefront, enabling massive speed-ups for easily parallelizable tasks, with early data indicating that GPU-accelerated computing may put the power of a supercomputing cluster in a desktop; see, e.g., [60, 61].
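To make the cutoff and periodic-boundary ideas concrete, the following minimal NumPy sketch computes short-range Lennard-Jones forces using a distance cutoff and the minimum-image convention. It is not taken from any of the codes cited above; the function name `lennard_jones_forces`, the box dimensions, and the cutoff value are illustrative assumptions, and the O(N^2) pair loop is kept only for readability.

```python
import numpy as np

def lennard_jones_forces(positions, box, r_cut, epsilon=1.0, sigma=1.0):
    """Short-range pairwise forces with a distance cutoff and minimum-image PBC.

    O(N^2) over pairs for clarity; production MD codes use cell/neighbor
    lists so that only pairs within r_cut are ever examined.
    """
    n = len(positions)
    forces = np.zeros_like(positions)
    for i in range(n - 1):
        # Minimum-image displacement vectors from atom i to all later atoms
        dr = positions[i + 1:] - positions[i]
        dr -= box * np.rint(dr / box)            # periodic boundary conditions
        r2 = np.sum(dr * dr, axis=1)
        mask = r2 < r_cut ** 2                   # discard pairs beyond the cutoff
        dr, r2 = dr[mask], r2[mask]
        sr6 = (sigma ** 2 / r2) ** 3
        # Lennard-Jones pair force magnitude divided by r (acts along dr)
        f_scalar = 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r2
        fij = f_scalar[:, None] * dr
        forces[i] -= fij.sum(axis=0)             # Newton's third law
        forces[i + 1:][mask] += fij
    return forces

# Illustrative usage: 1,000 particles in a 20 x 20 x 20 box, 2.5*sigma cutoff
rng = np.random.default_rng(0)
box = np.array([20.0, 20.0, 20.0])
positions = rng.uniform(0.0, 20.0, size=(1000, 3))
forces = lennard_jones_forces(positions, box, r_cut=2.5)
```

Note that the cutoff handles only the short-range part of the interaction; the long-range electrostatic contribution is what the particle mesh Ewald method [54] recovers in reciprocal space.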
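The next sketch indicates how such a force evaluation might be distributed over many processors using mpi4py. It uses a simple replicated-data, block-of-atoms split rather than the spatial domain decomposition employed by production MD engines, and the kernel, atom count, and box parameters are illustrative assumptions, not the schemes of the codes cited above.

```python
from mpi4py import MPI
import numpy as np

def block_forces(positions, lo, hi, box, r_cut):
    """Cutoff forces on atoms lo..hi-1 due to all other atoms (illustrative kernel)."""
    forces = np.zeros_like(positions)
    for i in range(lo, hi):
        dr = positions - positions[i]
        dr -= box * np.rint(dr / box)                    # minimum-image PBC
        r2 = np.sum(dr * dr, axis=1)
        mask = (r2 < r_cut ** 2) & (r2 > 0.0)            # cutoff, skip self-pair
        sr6 = 1.0 / r2[mask] ** 3
        fij = (24.0 * (2.0 * sr6 ** 2 - sr6) / r2[mask])[:, None] * dr[mask]
        forces[i] -= fij.sum(axis=0)
    return forces

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Replicated-data strategy: every rank holds all coordinates.
n_atoms, box, r_cut = 20_000, np.array([60.0, 60.0, 60.0]), 2.5
if rank == 0:
    positions = np.random.default_rng(0).uniform(0.0, 60.0, size=(n_atoms, 3))
else:
    positions = np.empty((n_atoms, 3))
comm.Bcast(positions, root=0)

# Each rank handles a contiguous block of atom indices ...
block = n_atoms // size
lo = rank * block
hi = n_atoms if rank == size - 1 else lo + block
local = block_forces(positions, lo, hi, box, r_cut)

# ... and the partial force arrays are summed across ranks.
forces = np.empty_like(positions)
comm.Allreduce(local, forces, op=MPI.SUM)
if rank == 0:
    print("max |F| =", np.max(np.linalg.norm(forces, axis=1)))
```

Such a script would typically be launched with an MPI runner, e.g. `mpiexec -n 64 python md_forces_mpi.py` (the script name is hypothetical); scaling to thousands of processors requires the domain-decomposition and communication-overlap strategies used by dedicated MD packages.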
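Finally, as a rough illustration of the data-parallel kernels that map well onto GPUs, the sketch below evaluates the cutoff Lennard-Jones energy with CuPy, a NumPy-compatible array library that executes on the GPU. This is a generic example, not the approach of MDGRAPE-3, ANTON, or the GPU MD codes cited above; the particle count and box size are arbitrary.

```python
import cupy as cp  # NumPy-compatible arrays and kernels executed on the GPU

def pair_energy_gpu(positions, box, r_cut, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy within a cutoff, evaluated on the GPU.

    The all-pairs distance matrix costs O(N^2) memory, so this brute-force
    form is only practical for modest N; GPU MD engines instead use cell
    lists and hand-tuned kernels.
    """
    dr = positions[:, None, :] - positions[None, :, :]
    dr -= box * cp.rint(dr / box)                 # minimum-image PBC
    r2 = cp.sum(dr * dr, axis=-1)
    mask = (r2 < r_cut ** 2) & (r2 > 0.0)         # cutoff, exclude self-pairs
    sr6 = (sigma ** 2 / r2[mask]) ** 3
    # Each pair appears twice in the full matrix, hence the factor of 1/2
    return 0.5 * cp.sum(4.0 * epsilon * (sr6 ** 2 - sr6))

# Illustrative usage: 4,000 particles in a 30 x 30 x 30 box
box = cp.asarray([30.0, 30.0, 30.0])
positions = cp.random.uniform(0.0, 30.0, size=(4000, 3))
energy = float(pair_energy_gpu(positions, box, r_cut=2.5))
```

Because every pair interaction is independent, the arithmetic parallelizes naturally across the thousands of GPU cores, which is the property that underlies the order-of-magnitude speed-ups reported in [60, 61].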