In recent years, machine learning (ML) models have been used to improve the physical parameterizations of general circulation models (GCMs). A significant challenge in integrating ML models into GCMs is online instability, which arises when the two are coupled for long-term simulation. In this study, we present a new strategy that demonstrates robust online stability when the entire physical parameterization package of a GCM is replaced by a deep ML algorithm. The method trains the ML model with a multistep scheme with experience replay, in which both the memory of physical tendencies from the training dataset and the ML algorithm’s own output at the previous time step are used in the training. The physics memory improves the accuracy of the ML model, while the experience replay constrains the amplification of cumulative errors in online coupling. The method is used to train the whole physical parameterization package of the Community Atmosphere Model version 5 (CAM5) with data from its high-resolution Multi-scale Modeling Framework (MMF) simulations. Three 6-year online simulations of CAM5 with the ML physics package, at operational spatial resolution and with real-world geography, are presented. The simulated spatial distributions of precipitation, surface temperature, and zonally averaged atmospheric fields are overall more accurate than those of the standard CAM5 and the benchmark model, even without additional physical constraints or tuning. This work is the first to demonstrate an experience-replay-based solution to the online instability problem in climate modeling with ML physics.
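The multistep training idea described above can be illustrated with a minimal sketch. It is not the study's implementation: the scalar toy system, the linear model, and the names `make_toy_data`, `train_with_replay`, and `rollout_mse` are all hypothetical, chosen only to show how a prediction at one step can be fed back as the "previous tendency" input at the next step during training. The first step of each training window takes its previous tendency from the dataset (the physics memory); later steps reuse the model's own output (the replay), which is treated as a fixed input when computing the gradient.

```python
import numpy as np


def make_toy_data(n_steps=200, seed=0):
    """Hypothetical stand-in for reference states and physical tendencies."""
    rng = np.random.default_rng(seed)
    state = np.sin(0.1 * np.arange(n_steps)) + 0.01 * rng.standard_normal(n_steps)
    tend = np.zeros(n_steps)
    for t in range(1, n_steps):
        # The "true" tendency depends on the state and on the previous tendency.
        tend[t] = 0.5 * state[t] + 0.3 * tend[t - 1]
    return state, tend


def features(state_t, prev_tend):
    """Model inputs: current state, previous tendency, and a bias term."""
    return np.array([state_t, prev_tend, 1.0])


def train_with_replay(state, tend, n_unroll=4, lr=0.05, epochs=50):
    """Fit a linear model over windows of n_unroll consecutive steps."""
    w = np.zeros(3)
    for _ in range(epochs):
        for t0 in range(1, len(state) - n_unroll + 1, n_unroll):
            prev = tend[t0 - 1]  # physics memory: tendency taken from the dataset
            grad = np.zeros(3)
            for k in range(n_unroll):
                x = features(state[t0 + k], prev)
                pred = w @ x
                # Squared-error gradient, with the replayed input held fixed.
                grad += (pred - tend[t0 + k]) * x
                prev = pred  # replay: the model's own output feeds the next step
            w -= lr * grad / n_unroll
    return w


def rollout_mse(w, state, tend):
    """Free-running error: the model always consumes its own previous output."""
    prev = tend[0]
    sq_errors = []
    for t in range(1, len(state)):
        pred = w @ features(state[t], prev)
        sq_errors.append((pred - tend[t]) ** 2)
        prev = pred
    return float(np.mean(sq_errors))


state, tend = make_toy_data()
w_trained = train_with_replay(state, tend)
print("untrained rollout MSE:", rollout_mse(np.zeros(3), state, tend))
print("trained rollout MSE:  ", rollout_mse(w_trained, state, tend))
```

The free-running rollout in `rollout_mse` is a loose analogue of online coupling, since errors in one step propagate into the inputs of the next. Exposing the model to its own outputs during training, as in the inner loop of `train_with_replay`, is what allows it to learn under the input distribution it will encounter in such a rollout.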