Federated Learning (FL) in load forecasting improves predictive accuracy by leveraging data from distributed load networks while preserving data privacy. However, the heterogeneous nature of smart grid load forecasting poses significant challenges to conventional FL methods, which are moreover ill-suited to resource-constrained devices. To address these issues, we propose a novel Lightweight Single Layer Aggregation (LSLA) framework tailored for smart grid networks. The LSLA framework mitigates data heterogeneity in load forecasting by emphasizing local learning and by aggregating only partial updates from local devices. Additionally, the framework is optimized for resource-constrained devices by introducing a stopping criterion during model training and by quantizing model weights. Our results show that quantization introduces an acceptable degradation of only 0.01% in Mean Absolute Error (MAE). Compared to traditional FL, LSLA achieves up to a 1000-fold reduction in communication overhead. Moreover, with an 8-bit fixed-point representation of neural network weights, a 75% reduction in storage/memory requirements is achieved, and computational cost is reduced by replacing complex floating-point arithmetic units with simpler fixed-point counterparts. By addressing data heterogeneity and minimizing data storage, computation, and communication overheads, our novel framework is well-suited for resource-constrained devices in smart grid networks.
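To make the storage claim concrete, the following is a minimal sketch (not the paper's implementation) of quantizing 32-bit floating-point weights to an 8-bit fixed-point representation; the function names, the 6-bit fractional split, and the tensor shape are illustrative assumptions, but the 75% storage reduction follows directly from replacing 4-byte floats with 1-byte integers.

```python
# Minimal sketch, assuming a simple signed fixed-point scheme:
# quantize float32 weights to int8 and measure storage and error.
import numpy as np

def quantize_fixed_point(weights, num_bits=8, frac_bits=6):
    """Quantize float32 weights to signed fixed-point with `frac_bits`
    fractional bits, stored as int8. Parameters are illustrative."""
    scale = 2 ** frac_bits
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(weights * scale), qmin, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the fixed-point codes."""
    return q.astype(np.float32) / scale

w = np.random.uniform(-1, 1, size=(128, 64)).astype(np.float32)
q, scale = quantize_fixed_point(w)
w_hat = dequantize(q, scale)

print(f"float32 storage: {w.nbytes} bytes")  # 32768 bytes
print(f"int8 storage:    {q.nbytes} bytes")  # 8192 bytes -> 75% smaller
print(f"max abs error:   {np.abs(w - w_hat).max():.4f}")
```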