4.1 Multicore architecture and the decline of Moore’s
law
The linear trend in Figure 2 (right) ceases to hold beyond 2007 due to
the power wall in chip architecture. The industry was
forced to find a new paradigm to sustain performance enhancement. The
viable option was to replace the single power-inefficient processor with
many efficient processors on the same chip, with the number of
processors, or cores, increasing with each technology generation,
roughly every two years. This style of chip was labeled the multicore
microprocessor. Hence, the
leap to multicore is not based on a breakthrough in programming or
architecture but is instead a retreat from building power-efficient,
high-clock-rate, single-core chips [47]. The emergence of the
multicore architecture in 2005 prompted shared-memory architectures and
the establishment of the OpenMP (Open Multi-Processing) application
programming interface (API) standard, which supports multi-platform
shared-memory multiprocessing programming in C, C++, and Fortran
(OpenMP.org).
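To illustrate the shared-memory model, the following is a minimal
sketch of an OpenMP parallel loop in C: threads share the array while a
reduction clause combines per-thread partial sums. The array size, its
contents, and the compiler invocation are illustrative assumptions, not
taken from the text.

/* A minimal OpenMP sketch: parallel initialization and a
   parallel reduction over a shared array (sizes are
   illustrative). Compile with, e.g.: gcc -fopenmp sum.c -o sum */
#include <stdio.h>
#include <omp.h>

int main(void) {
    const int n = 1000000;
    static double x[1000000];
    double sum = 0.0;

    /* Threads share x; the loop iterations are divided among them. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        x[i] = 0.5 * i;

    /* The reduction clause gives each thread a private partial
       sum and combines the partial sums after the loop. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += x[i];

    printf("sum = %f (threads available: %d)\n",
           sum, omp_get_max_threads());
    return 0;
}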
One of the main drawbacks of multiple instruction, multiple data (MIMD)
platforms is the high cost of infrastructure. An alternative to MIMD
platforms is the single instruction, multiple data (SIMD) architecture.
With the increase in
computational power and multicore options at the desktop level, and the
low cost of new processing architectures such as graphics processing
units (GPUs), the attention of the scientific community has moved back
to SIMD platforms [48]. The use of GPUs in scientific computing
has exploded, enabled by programming models such as CUDA (Compute
Unified Device Architecture), a parallel computing platform and
application programming interface (API) created by Nvidia
(https://en.wikipedia.org/wiki/CUDA). CUDA allows software developers
and engineers to use a CUDA-enabled GPU for general-purpose processing,
an approach
termed GPGPU (General-Purpose computing on Graphics Processing Units).
The CUDA platform is a software layer that gives direct access to the
GPU’s virtual instruction set and parallel computational elements for
compute kernels.
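As a concrete sketch of a compute kernel, the following CUDA program
adds two vectors on the GPU, with each thread handling one element. The
vector length, block size, and use of managed memory are illustrative
choices, not prescribed by the text.

// A minimal CUDA sketch: a vector-addition kernel launched over a
// grid of thread blocks (sizes are illustrative).
// Compile with, e.g.: nvcc vadd.cu -o vadd
#include <cstdio>

__global__ void vadd(const float *a, const float *b, float *c, int n) {
    // Each thread computes one element, indexed by its
    // position in the grid of thread blocks.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    // Managed (unified) memory is accessible from both host and device.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vadd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}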
Challenges for researchers utilizing HPC platforms and infrastructure
range from understanding emerging platforms, to optimizing algorithms
for massively parallel architectures, to efficiently accessing and
handling data at large scale. HPC technology is rapidly evolving and is
synergistic with, and complementary to, the development of scientific
computing; some useful links on HPC reviews, training, community, and
resources are summarized in Table 4.