Deep Neural Networks (DNNs) belong to an important class of machine learning algorithms widely used to classify digital data in tasks such as image and speech recognition. The computational complexity of a DNN-based image classifier is higher than that of traditional fully connected (FC) feed-forward NNs. Therefore, dedicated cloud servers and Graphics Processing Units (GPUs) are utilized to achieve high-speed, large-capacity computation in machine vision systems. However, there is a growing demand for real-time processing of complex machine-learning tasks on embedded systems. Since FC layers consume the largest fraction of computational power and memory footprint, designing power-efficient, low-footprint NN architectures for embedded systems is crucial. This article proposes a novel design strategy and algorithms in which a power-efficient FC DNN is implemented using a pipelined, parallel Fast Fourier Transform (FFT) on a circular projection-based architecture. The footprint of the DNN is further reduced using a folded FFT network. The proposed algorithm is tested on two benchmark datasets, the “MNIST database of handwritten digits” and the “CIFAR-10 database”. In both cases, we achieved > 90% accuracy, while the power consumption of the network is 37% lower than that of traditional FFT architecture-based DNNs.
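The circular projection approach mentioned above rests on a standard identity: multiplying an input vector by a circulant weight matrix is equivalent to element-wise multiplication in the Fourier domain, reducing the cost of an FC layer from O(n²) to O(n log n). A minimal NumPy sketch of that identity (the function names are illustrative and not taken from the paper):

```python
import numpy as np

def circulant_matvec_fft(w, x):
    # Multiply the circulant matrix defined by first column w with x
    # in O(n log n) via the convolution theorem:
    # C(w) @ x == IFFT(FFT(w) * FFT(x)).
    return np.fft.ifft(np.fft.fft(w) * np.fft.fft(x)).real

def circulant_matvec_direct(w, x):
    # Reference O(n^2) implementation: build the full circulant matrix,
    # whose j-th column is w cyclically shifted down by j positions.
    n = len(w)
    C = np.empty((n, n))
    for j in range(n):
        C[:, j] = np.roll(w, j)
    return C @ x

rng = np.random.default_rng(0)
w = rng.standard_normal(8)   # first column of the circulant weight matrix
x = rng.standard_normal(8)   # input activation vector
assert np.allclose(circulant_matvec_fft(w, x), circulant_matvec_direct(w, x))
```

In a circulant FC layer only the n-element vector w is stored instead of an n × n weight matrix, which is the source of the footprint reduction the abstract refers to.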