Literature Review

In the study conducted by R. Leupers et al. [1], the importance of software in digital signal processing (DSP) applications was examined, highlighting the need for automated tools to support DSP-based software development. The study reviewed techniques for high-level block-diagram-based modeling of DSP applications, for translating block-diagram specifications into efficient C programs through global target-independent optimization methods, and for compiling C programs into optimized machine code for programmable DSP processors. Ansong Ni et al. [2] introduced LEVER, a straightforward approach to enhance language-to-code generation by training verifiers to assess program correctness based on the natural language input, the program itself, and its execution results. The sampled programs were ranked by combining the verification score with the LLM generation probability. LEVER achieved notable improvements ranging from 4.6% to 10.9% with code-davinci-002 across four datasets containing samples from the table QA, math QA, and basic Python programming domains. Chanchai Supaartagorn et al. [3] introduced an automatic code generator tool based on structured flowcharts, in which basic shapes are combined to form structured flowcharts that can be converted into source code. The tool’s performance was evaluated with two groups: 5 experts and 93 general users. Both groups reported high satisfaction, with average scores of 4.48 and 4.27 and standard deviations of 0.59 and 0.64 for experts and general users, respectively, indicating acceptable performance. Tomasz Szydło et al. [4] noted that programming libraries often demand excessive resources, making them impractical for deployment on embedded processors. They introduced the concept of source code generation for machine learning models, along with algorithms for generating code for commonly used machine learning methods, and validated the concept through various use cases.
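To make the idea of generating source code for a machine learning model more concrete, the following is a minimal sketch, assuming a small hand-written decision tree in place of a trained model. The names `Leaf`, `Node`, `tree_to_c`, and `example_tree` are hypothetical and are not taken from [4]; the sketch only illustrates how a model can be turned into dependency-free C code suitable for an embedded target.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    label: int  # predicted class at this leaf


@dataclass
class Node:
    feature: int       # index into the input feature array
    threshold: float   # take the left branch when x[feature] <= threshold
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def _emit(tree: Tree, depth: int) -> List[str]:
    """Return the C lines for one subtree, indented by depth levels."""
    pad = "    " * depth
    if isinstance(tree, Leaf):
        return [f"{pad}return {tree.label};"]
    lines = [f"{pad}if (x[{tree.feature}] <= {float(tree.threshold)!r}f) {{"]
    lines += _emit(tree.left, depth + 1)
    lines.append(f"{pad}}} else {{")
    lines += _emit(tree.right, depth + 1)
    lines.append(f"{pad}}}")
    return lines


def tree_to_c(tree: Tree, name: str = "predict") -> str:
    """Generate a self-contained C function that evaluates the tree."""
    header = f"int {name}(const float *x) {{"
    return "\n".join([header, *_emit(tree, 1), "}"]) + "\n"


# Hypothetical tree: two features, three classes.
example_tree = Node(0, 2.5, Leaf(0), Node(1, 1.0, Leaf(1), Leaf(2)))
print(tree_to_c(example_tree))
```

The generated function contains only comparisons and returns, so it needs no runtime library on the target device, which is the property that motivates this kind of code generation.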
Batuhan Aşıroğlu et al. [5] observed that the web design process starts with creating mock-ups for individual web pages, either manually or using graphic design tools. Software engineers then convert these mock-ups into structured HTML or similar code, refining the result until the desired template is achieved. Their research aimed to automate code generation from hand-drawn mock-ups using computer vision techniques and selected deep learning methods. Their system demonstrated a method accuracy of 96% and a validation accuracy of 73%. Samantha Ray et al. [6] observed that existing user-interface-driven solutions for creating flowcharts challenge learners with intricate drag-and-drop menus, while sketching-based alternatives lack support beyond initial pseudocode generation. The researchers created Flow2Code to translate hand-drawn flowcharts into code. Flow2Code recognizes various flowchart shapes and converts the flowcharts into executable code. It provides an intuitive, interactive interface that lets users modify both their flowchart and the resulting code. Humans can comprehend technical documents effortlessly, and researchers have made several attempts to impart this capability to AI-based systems. N. G. Bourbakis et al. [7] exploited the interaction between two technical document (TD) modalities, block diagrams and the associated natural language text, to develop a system that automatically generates pseudocode defining the functionality of the system under examination. The methodology maps the TD modalities into Stochastic Petri-nets (SPN) to enhance the system diagrams used for pseudocode generation. With this method, the researchers aimed to achieve automatic deep comprehension of technical documents.
Aspects such as the use of diagram images and the automated understanding of mathematical formulas in technical documents remain relatively understudied. Gkorgkolis Nikolaos et al. [8] introduced a new formal scheme for modeling digital diagram images and extended it to a generative framework for creating artificial images and annotations. They proposed converting the pseudocode generation problem into an image captioning task, employing a range of techniques based on adaptive image partitioning. They addressed the semantic understanding of mathematical formulas by conducting an evaluative survey, followed by a formal synthesis framework that uses formula graphs as metadata to produce valuable formulas. This synthesis framework was validated using a deep geometric learning mechanism that uses formula data to simulate missing a priori knowledge.
Enrique Dehaerne et al. [9] analyzed 37 publications sourced from the arXiv and IEEE Xplore databases, each describing a project in which ML models were trained on programming language data to produce code. They identified three main paradigms of code generation: description-to-code, code-to-description, and code-to-code. The papers primarily focused on ML applications such as generating code from natural language descriptions, documentation generation, and automatic program repair. Commonly used ML models in these projects include recurrent neural networks, transformers, and convolutional neural networks, along with various other neural network architectures and non-neural techniques. The review also compared model types, tokenizers, data volume and quality, and evaluation methods for synthesized code. Presently, researchers are focused on generating code from requirement documents; however, existing methods often struggle with requirements that demand intricate problem-solving abilities. Zejie Liu et al. [10] introduced a novel method for generating source code from flowcharts accompanied by textual descriptions. The researchers manually curated a benchmark dataset comprising 320 flowcharts paired with their source code. Adapting existing approaches to this new task is challenging because flowcharts contain various elements and multiple connections between their nodes. To address these challenges, the researchers proposed a two-stage code generation model. In the first stage, a structure recognition algorithm translates the flowchart into pseudocode. In the second stage, a code generation model converts the pseudocode into executable code.
To ensure a comprehensive understanding of algorithms, it is necessary to devise methods for generating corresponding text descriptions. The study conducted by Sagarika Ghosh et al. [11] aligns algorithms in various forms, such as pseudocode and hand-drawn flowcharts, with textual explanations. The researchers proposed rules for generating pseudocode from hand-drawn flowcharts and a transfer learning method based on S-DistilBERT to compute the similarity score between different forms of algorithms and their text descriptions. Block and line identification, along with OCR, were used to generate pseudocode from hand-drawn flowcharts. Experimental results indicate an 85% success rate in generating equivalent pseudocode. Their fine-tuned S-DistilBERT model achieved accuracies of 75.59% for matching existing pseudocode and 74.57% for matching generated pseudocode with the corresponding textual descriptions. The rules devised by the researchers were found to be suitable only for non-recursive flowcharts. Xiang-Hu Wu et al. [12] proposed a structure identification algorithm for structured flowcharts and verified its correctness using enumeration iteration. They also introduced an automatic code generation algorithm, which was validated in the same way. Finally, they developed an integrated development platform based on these algorithms, incorporating flowchart modeling, automatic code generation, and support for CDT/GCC/GDB. The effectiveness of the proposed algorithms was evaluated through practical implementation.
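The flowchart-to-pseudocode step that several of the reviewed works rely on can be illustrated with a short sketch. It is not the algorithm of [10], [11], or [12]; it is a simplified, rule-based walk under stated assumptions: the flowchart has already been recognized into a graph (shape detection and OCR are omitted), it is structured, it has a single end node, the body of an if has at least one node, and loops continue on the "yes" branch of a decision. The encoding `FLOWCHART` and the function names are hypothetical.

```python
from collections import deque

# Hypothetical encoding: node id -> (kind, text, successors).
# A "decision" node lists its successors as [yes_branch, no_branch].
FLOWCHART = {
    "start": ("start", "", ["init"]),
    "init": ("process", "i = 0", ["check"]),
    "check": ("decision", "i < 3", ["even", "end"]),
    "even": ("decision", "i % 2 == 0", ["p_even", "p_odd"]),
    "p_even": ("process", "print('even')", ["inc"]),
    "p_odd": ("process", "print('odd')", ["inc"]),
    "inc": ("process", "i = i + 1", ["check"]),
    "end": ("end", "", []),
}


def reachable(chart, source, blocked):
    """Nodes reachable from source in BFS order.

    Blocked nodes (enclosing loop headers and merge points) are recorded
    but not expanded, so the search stays inside the current region.
    """
    order, seen, queue = [], {source}, deque([source])
    while queue:
        node = queue.popleft()
        order.append(node)
        if node in blocked:
            continue
        for nxt in chart[node][2]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def emit(chart, node, stop, blocked, depth, out):
    """Append pseudocode for the region that starts at node and ends before stop."""
    pad = "    " * depth
    while node != stop:
        kind, text, succ = chart[node]
        if kind == "end":
            return
        if kind == "decision":
            yes, no = succ
            if node in reachable(chart, yes, blocked):
                # The yes branch leads back to the decision: a while loop.
                out.append(f"{pad}while {text}:")
                emit(chart, yes, node, blocked | {node}, depth + 1, out)
                node = no
            else:
                # Otherwise an if/else; the merge is the nearest common successor.
                yes_side = set(reachable(chart, yes, blocked))
                merge = next(n for n in reachable(chart, no, blocked) if n in yes_side)
                out.append(f"{pad}if {text}:")
                emit(chart, yes, merge, blocked | {merge}, depth + 1, out)
                if no != merge:
                    out.append(f"{pad}else:")
                    emit(chart, no, merge, blocked | {merge}, depth + 1, out)
                node = merge
        else:
            if kind != "start":
                out.append(f"{pad}{text}")
            node = succ[0]


def flowchart_to_pseudocode(chart, start="start"):
    out = []
    emit(chart, start, None, frozenset(), 0, out)
    return "\n".join(out)


print(flowchart_to_pseudocode(FLOWCHART))
```

For this example the emitted pseudocode happens to be valid Python, which mirrors the two-stage idea of first recovering structure and then producing executable code. Unstructured flowcharts (for example, jumps out of a loop body) and recursive flowcharts are outside the scope of this sketch.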