Federated learning (FL) is a distributed machine learning (ML) technique that enables multiple decentralized clients to collaboratively learn a shared ML model without sharing their training data. However, deploying FL presents challenges related to energy efficiency, device heterogeneity, and fault tolerance. This paper introduces novel techniques that jointly address these challenges. We propose an Adaptive Client Scheduling (ACS) algorithm to efficiently manage heterogeneous clients clustered by characteristics such as processing power. Our proposed energy-efficient CNN and ACS methods are designed to reduce the computation overhead and energy consumption associated with techniques such as hyperparameter optimization and the Adaptive Weight Pruning (AWP) algorithm. Results on the MNIST and CIFAR-10 datasets demonstrate that our proposed CNN and ACS algorithms reduce computation overhead and energy consumption by up to 10% and 20%, respectively, compared to baseline methods, while achieving similar or better accuracy than baseline CNNs. Furthermore, applying the AWP algorithm to our CNN model achieves higher energy efficiency in FL, yielding additional energy savings of up to 5% in fault-tolerant fog-edge environments. To create such a fault-tolerant fog-edge environment, we employ task replication (TR), which creates multiple instances of a task and distributes them to different fog nodes, improving reliability by 12%. Together, the proposed energy-efficient CNN and ACS methods, combined with AWP and TR, enable highly energy-efficient and fault-tolerant FL.
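As a rough illustration of the client-clustering idea behind ACS, the Python sketch below buckets clients into tiers by processing power and then selects the most energy-rich clients from each tier for a training round. The `Client` fields, the bucketing rule, and the selection heuristic are illustrative assumptions, not the paper's exact algorithm.

```python
# Hypothetical sketch of Adaptive Client Scheduling (ACS): the exact
# algorithm is not given here, so the names and rules below are assumptions.
import random
from dataclasses import dataclass

@dataclass
class Client:
    client_id: int
    flops: float    # processing power (GFLOPS), used as the clustering key
    battery: float  # remaining energy budget in [0, 1]

def cluster_clients(clients, n_clusters=3):
    """Bucket clients into n_clusters tiers by processing power."""
    ordered = sorted(clients, key=lambda c: c.flops)
    size = max(1, len(ordered) // n_clusters)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def schedule_round(clusters, per_cluster=2):
    """Pick the most energy-rich clients from each tier for this FL round."""
    selected = []
    for tier in clusters:
        tier_sorted = sorted(tier, key=lambda c: c.battery, reverse=True)
        selected.extend(tier_sorted[:per_cluster])
    return selected

clients = [Client(i, random.uniform(1, 50), random.random()) for i in range(12)]
print([c.client_id for c in schedule_round(cluster_clients(clients))])
```

Clustering by processing power lets the scheduler avoid stragglers dominating a round, which is one plausible way heterogeneity translates into energy savings.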
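The pruning criterion behind AWP is not specified here; a minimal magnitude-based sketch, assuming a per-layer quantile threshold as the "adaptive" element, could look like the following.

```python
# Hypothetical magnitude-based sketch of Adaptive Weight Pruning (AWP):
# the per-layer quantile threshold used here is an assumption.
import numpy as np

def adaptive_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights, layer by layer.

    `sparsity` is the fraction of weights removed in each layer; an
    adaptive schedule could raise it as training converges.
    """
    pruned = {}
    for name, w in weights.items():
        threshold = np.quantile(np.abs(w), sparsity)  # layer-adaptive cutoff
        pruned[name] = np.where(np.abs(w) < threshold, 0.0, w)
    return pruned

layer_weights = {"conv1": np.random.randn(8, 8), "fc1": np.random.randn(16, 4)}
pruned = adaptive_prune(layer_weights, sparsity=0.6)
print({k: float((v == 0).mean()) for k, v in pruned.items()})
```

Pruned weights shrink the model that each client must compute and transmit, which is how pruning contributes to the reported energy savings.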
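Finally, a minimal sketch of the task replication (TR) idea, assuming replicas of the same task race on independent fog nodes and the first successful result is kept; the node behavior and failure model below are simulated assumptions.

```python
# Minimal sketch of task replication (TR) across fog nodes: replicas of one
# task run on several nodes and the first successful result wins.
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_on_fog_node(node_id, task):
    """Simulated fog node that fails with some probability."""
    if random.random() < 0.3:                 # transient node failure
        raise RuntimeError(f"node {node_id} failed")
    return f"{task} done on node {node_id}"

def replicate_task(task, nodes, replicas=3):
    """Dispatch `replicas` copies of `task`; return the first success."""
    with ThreadPoolExecutor(max_workers=replicas) as pool:
        futures = [pool.submit(run_on_fog_node, n, task)
                   for n in random.sample(nodes, replicas)]
        for fut in as_completed(futures):
            try:
                return fut.result()           # first replica to finish wins
            except RuntimeError:
                continue                      # tolerate a failed replica
    raise RuntimeError("all replicas failed")

print(replicate_task("aggregate_gradients", nodes=list(range(5))))
```

Replication trades extra computation for reliability: a round completes as long as any one replica of the task survives, which is consistent with the reported 12% reliability improvement.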