Edge clouds (ECs) have recently shown outstanding advantages in enhancing customized user service experiences, owing to their proximity to users and their location awareness. However, operating a large-scale EC network inevitably incurs a significant energy cost for EC providers, which can offset their service revenue if the energy cost is not properly managed. In this paper, we focus on reducing the energy cost for EC providers by leveraging both electricity price-aware geographical load balancing and dynamic central processing unit (CPU) provisioning, taking into account the spatio-temporal diversity of electricity prices and user task demands. Because turning CPUs and services on and off incurs a significant “switching cost”, we formulate a multi-timescale energy cost minimization problem that integrates large-timescale CPU provisioning and service placement with small-timescale geographical task dispatching and CPU resource allocation. We exploit Lagrange dual decomposition to handle the spatio-temporal coupling among the decision variables. We propose a fully distributed mini-batch learning (MBL) algorithm, which relies on parameter approximation for large-timescale decision making, to learn the optimal dual variables, i.e., the Lagrange multipliers. We present a rigorous performance analysis of the algorithm and conduct extensive simulations based on real electricity price data from Canada to demonstrate the superior performance of the MBL algorithm compared with several baseline schemes.
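
To make the dual decomposition idea concrete, the following is a minimal sketch of price-aware task dispatching on a toy problem. It is not the MBL algorithm: the quadratic per-site energy cost, the function name `dispatch`, the step size, and the example prices, capacities, and demand are all assumptions chosen for illustration. A single Lagrange multiplier prices the shared demand constraint, each site then solves its own subproblem independently, and the multiplier is updated by a dual subgradient step.

```python
def dispatch(prices, caps, demand, step=0.1, iters=2000):
    """Toy dual decomposition for price-aware task dispatching.

    Assumed model (hypothetical, for illustration only):
        minimize   sum_i 0.5 * prices[i] * x_i**2
        subject to sum_i x_i = demand,  0 <= x_i <= caps[i]
    """
    lam = 0.0  # Lagrange multiplier of the demand constraint
    loads = [0.0] * len(prices)
    for _ in range(iters):
        # Per-site subproblem: min_x 0.5 * p * x**2 - lam * x on [0, cap].
        # Its closed-form solution is lam / p clipped to the capacity range.
        loads = [min(max(lam / p, 0.0), c) for p, c in zip(prices, caps)]
        # Dual subgradient step: raise lam while demand is still unmet.
        lam += step * (demand - sum(loads))
    return lam, loads


# Hypothetical example: three sites with increasing electricity prices.
lam, loads = dispatch(prices=[1.0, 2.0, 4.0], caps=[10.0, 10.0, 10.0], demand=7.0)
print(lam, loads)
```

With these assumed inputs the multiplier settles near 4 and the loads near 4, 2, and 1, so the cheapest site carries the most tasks. Only the scalar multiplier is shared across sites, which is what allows this kind of scheme to be distributed.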