Luca Zulberti

and 5 more

Modern computing platforms exploiting Coarse-Grained Reconfigurable Array (CGRA) architectures depend heavily on the efficiency with which data are handled inside the architecture. Moving data is critical in computing-intensive systems to maximize energy efficiency and reduce latency. Access to the main memory is the most costly operation; therefore, the data retrieved must be kept near the processing elements of the architecture as long as possible to reduce data transfers. Modern algorithms involve very different access patterns to the main memory, requiring high versatility from Direct Memory Access (DMA) mechanisms. This work presents the SmartDMA architecture, a RISC-V-based programmable DMA controller specifically designed to perform adaptable memory access patterns and implement proper data reuse policies in CGRA-based systems. It comprises a set of Data Mover Engines (DMEs) that implement configurable 1D, 2D, and 3D data movements. Using a custom RISC-V ISA extension and a programmable event network, the application-specific firmware loaded on the SmartDMA can schedule DMA commands among all DMEs, ensuring that they are always busy with data transactions. We show a typical use case that takes advantage of CGRA-based processing and highlights the functionality of the SmartDMA. We synthesized the SmartDMA on TSMC 40 nm low-power standard-cell technology at 350 MHz in three architectural configurations (small, medium, and large) with an increasing number of DMEs and a maximum memory throughput of 5.6 GB/s. The small configuration occupies 46.2k µm² of cell area and consumes 8.64 mW. The medium one occupies 117k µm² and consumes 23.1 mW. The large one occupies 243k µm² and consumes 42.7 mW.
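To make the idea of configurable 1D, 2D, and 3D data movements concrete, the following is a minimal Python sketch of the address sequence that a strided transfer of up to three dimensions would visit. It is an illustration only, not the SmartDMA command format: the function name `strided_addresses` and the `base`, `counts`, and `strides` parameters are hypothetical, and the abstract does not describe how the DMEs actually encode a transfer.

```python
from typing import Iterator, Tuple


def strided_addresses(
    base: int,
    counts: Tuple[int, int, int],
    strides: Tuple[int, int, int],
) -> Iterator[int]:
    """Yield the byte addresses visited by three nested strided loops.

    Hypothetical model, not the SmartDMA interface. `counts` and `strides`
    are ordered from the outermost to the innermost dimension; a count of 1
    collapses that dimension, so the same function covers 1D and 2D cases.
    """
    outer_count, middle_count, inner_count = counts
    outer_stride, middle_stride, inner_stride = strides
    for k in range(outer_count):
        for j in range(middle_count):
            for i in range(inner_count):
                yield base + k * outer_stride + j * middle_stride + i * inner_stride


# 2D example: a tile of 2 rows x 3 words (4 bytes each) taken from a
# row-major matrix whose rows are 32 bytes apart.
tile = list(strided_addresses(0x1000, (1, 2, 3), (0, 32, 4)))

# 1D example: four contiguous 4-byte words.
line = list(strided_addresses(0x2000, (1, 1, 4), (0, 0, 4)))
```

In this model, fetching a tile once and keeping it near the processing elements is what avoids repeated accesses to main memory. As a side calculation, assuming 1 GB = 10^9 bytes, the reported 5.6 GB/s at 350 MHz corresponds to 16 bytes moved per clock cycle.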

Matteo Monopoli

and 3 more

Artificial Intelligence has gained widespread adoption across different industrial sectors, serving as a versatile tool for a diverse array of tasks, ranging from image classification and traffic forecasting to natural language processing and speech recognition. In the space domain, however, a special focus must be placed on area overhead, power consumption, and fault-tolerant solutions. In this particular scenario, soft General-Purpose Computing on Graphics Processing Units has the potential to revolutionise space-related activities. Indeed, by leveraging both Field Programmable Gate Array technology and Graphics Processing Unit computing, it becomes feasible to achieve high-performance capabilities without compromising either power consumption or radiation tolerance. Moreover, the use of reconfigurable hardware can facilitate the acceleration of a wide range of Machine Learning algorithms, avoiding the drawbacks of excessive specialisation. This paper explores the state of the art in hardware platforms for edge acceleration of Artificial Intelligence algorithms and compares it with a possible System-on-Chip implementation based on a soft Graphics Processing Unit. The attention then shifts to key aspects for future space missions, such as reliability and Dynamic Partial Reconfiguration. We point out the lack of European technological solutions, emphasising the promising potential offered by NanoXplore devices. We also discuss the importance of fault detection and mitigation techniques in space applications, covering the most commonly employed hardware methods for reliability enhancement and highlighting the lack of work in the field of General-Purpose Computing on Graphics Processing Units, especially in the space sector. Furthermore, we briefly examine the implementation of Dynamic Partial Reconfiguration on a System-on-Chip featuring a soft Graphics Processing Unit IP. Finally, in the last section of the paper, we hint at future developments of the project and conclude the work.