4.1 Multicore architecture and the decline of Moore’s law
The linear trend in Figure 2 (right) ceases to hold beyond 2007 due to the power wall in chip architecture, and the industry was forced to find a new paradigm to sustain performance growth. The viable option was to replace the single, power-inefficient processor with many efficient processors on the same chip, with the number of processors, or cores, increasing with each technology generation, roughly every two years. This style of chip was labeled the multicore microprocessor. Hence, the leap to multicore was not based on a breakthrough in programming or architecture; it was a retreat from the harder problem of building power-efficient, high-clock-rate, single-core chips [47]. The emergence of multicore architectures around 2005 prompted renewed interest in shared-memory architectures and the establishment of the OpenMP (Open Multi-Processing) standard, an application programming interface (API) that supports multi-platform shared-memory multiprocessing in C, C++, and Fortran (OpenMP.org).
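To illustrate the shared-memory programming model that OpenMP standardizes, the short C sketch below parallelizes a vector sum across the available cores; the array length, variable names, and compiler flags are illustrative assumptions rather than code from any cited source.

    #include <stdio.h>
    #include <omp.h>

    #define N 1000000

    int main(void) {
        static double a[N], b[N], c[N];
        double sum = 0.0;

        /* Initialize the input vectors (illustrative data). */
        for (int i = 0; i < N; i++) {
            a[i] = 0.5 * i;
            b[i] = 2.0 * i;
        }

        /* OpenMP splits the loop iterations among threads; the
           reduction clause merges the per-thread partial sums. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++) {
            c[i] = a[i] + b[i];
            sum += c[i];
        }

        printf("sum = %f, max threads = %d\n", sum, omp_get_max_threads());
        return 0;
    }

Built with an OpenMP-capable compiler (e.g., gcc -fopenmp), the same source runs serially or in parallel depending only on the compiler flag, which is the portability the standard is designed to provide.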
One of the main drawbacks of multiple instruction, multiple data (MIMD) platforms is the high cost of infrastructure. The alternative to MIMD platforms is the single instruction, multiple data (SIMD) architecture. With the increase in computational power and multicore options at the desktop level, and the low cost of new processing architectures such as graphics processing units (GPUs), the attention of the scientific community has moved back to SIMD platforms [48]. The use of GPUs in scientific computing has grown rapidly, enabled by programming languages and platforms such as CUDA (Compute Unified Device Architecture), a parallel computing platform and application programming interface (API) model created by Nvidia (https://en.wikipedia.org/wiki/CUDA). CUDA allows software developers and software engineers to use a CUDA-enabled GPU for general-purpose processing, an approach termed GPGPU (general-purpose computing on graphics processing units). The CUDA platform is a software layer that gives compute kernels direct access to the GPU's virtual instruction set and parallel computational elements, as sketched below. Challenges for researchers utilizing HPC platforms and infrastructure range from understanding emerging platforms, to optimizing algorithms for massively parallel architectures, to efficiently accessing and handling data at large scale. HPC technology is rapidly evolving and is synergistic with, yet complementary to, the development of scientific computing; some useful links to HPC reviews, training, community, and resources are summarized in Table 4.
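As a concrete, deliberately minimal example of a compute kernel, the CUDA C sketch below performs the same vector addition on a GPU; the kernel name vecAdd, the block size of 256 threads, and the vector length are assumptions made for illustration.

    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    /* Kernel: each GPU thread computes one element of c = a + b. */
    __global__ void vecAdd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)                      /* guard threads past the end */
            c[i] = a[i] + b[i];
    }

    int main(void) {
        const int n = 1 << 20;          /* ~1M elements */
        const size_t bytes = n * sizeof(float);

        /* Host-side data. */
        float *ha = (float *)malloc(bytes);
        float *hb = (float *)malloc(bytes);
        float *hc = (float *)malloc(bytes);
        for (int i = 0; i < n; i++) { ha[i] = 1.0f; hb[i] = 2.0f; }

        /* Device-side buffers and host-to-device transfer. */
        float *da, *db, *dc;
        cudaMalloc(&da, bytes);
        cudaMalloc(&db, bytes);
        cudaMalloc(&dc, bytes);
        cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

        /* Launch one thread per element, grouped into 256-thread blocks. */
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        vecAdd<<<blocks, threads>>>(da, db, dc, n);

        cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
        printf("c[0] = %f\n", hc[0]);   /* expected: 3.0 */

        cudaFree(da); cudaFree(db); cudaFree(dc);
        free(ha); free(hb); free(hc);
        return 0;
    }

The <<<blocks, threads>>> launch configuration is the mechanism by which CUDA exposes the GPU's parallel computational elements: every vector element is handled by its own lightweight thread, in contrast to the loop-splitting model of the OpenMP example above.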