The research under discussion explores the intersection of cloud/edge computing and time-series forecasting, with the aim of optimizing resource utilization and reducing energy consumption in telecommunications networks. It traces the evolution of the machine learning models used for forecasting, from Recurrent Neural Networks (RNNs), through Long Short-Term Memory networks (LSTMs) augmented with attention mechanisms, to transformer architectures. The ultimate goal is predictions precise enough to let smart cities and telecom networks adapt in real time to varying demands, improving service quality and reducing operational costs. A significant focus is placed on the attention mechanism, particularly sparse attention, which is presented as a way to address the difficulty transformer models face in handling long data sequences efficiently. Among sparse-attention models, the Informer model is highlighted as especially promising for cloud/edge scenarios. The article also provides a list of cutting-edge use cases and proof-of-concept demonstrations to substantiate the claimed benefits of these advanced forecasting models in the cloud/edge domain.
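To make the efficiency argument concrete: dense self-attention scores every query against every key, so its cost grows quadratically with sequence length, while sparse variants restrict each query to a subset of keys. The sketch below contrasts dense attention with a simple local-window sparse pattern; it is a minimal illustration of the general sparse-attention idea in NumPy, not Informer's actual ProbSparse mechanism, and all function names here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(Q, K, V):
    # Dense attention: every query attends to every key -> O(L^2) in sequence length L.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ V

def local_sparse_attention(Q, K, V, window=4):
    # Toy sparse pattern: each query attends only to keys inside a fixed
    # window around it, reducing cost from O(L^2) to O(L * window).
    # (Informer instead selects a subset of dominant queries; this is
    # only an illustrative stand-in for the sparsity idea.)
    L, d = Q.shape
    out = np.zeros_like(V)
    for i in range(L):
        lo, hi = max(0, i - window), min(L, i + window + 1)
        scores = Q[i] @ K[lo:hi].T / np.sqrt(d)
        out[i] = softmax(scores) @ V[lo:hi]
    return out

rng = np.random.default_rng(0)
L, d = 16, 8  # short synthetic sequence of 16 steps, 8-dim features
Q, K, V = rng.normal(size=(3, L, d))
dense = full_attention(Q, K, V)
sparse = local_sparse_attention(Q, K, V, window=4)
print(dense.shape, sparse.shape)  # both outputs have shape (16, 8)
```

The two functions produce outputs of the same shape; the sparse version simply computes far fewer query-key scores, which is what makes long forecasting horizons tractable for transformer-style models.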