In recent years, advances in attention mechanisms and residual networks have led to their widespread use in facial expression classification. However, challenges remain, notably weak extraction of key features and complex model training. To address them, this paper proposes a classification algorithm based on an improved attention mechanism and residual network. First, ResNet50 serves as the backbone network for feature extraction, and the Convolutional Block Attention Module (CBAM) is incorporated to automatically learn and selectively emphasize crucial local features of the input data. Second, the residual modules of the backbone network are restructured to enhance overall feature extraction. Finally, the improved CBAM-ERF, which includes enhancements to the channel attention module (CAM), is incorporated to address the issue of neuron suppression within intervals, thereby accelerating network convergence and improving classification efficiency. We conducted experiments on three publicly available facial expression datasets: FER2013, CK+, and RAF-DB. Compared with the baseline methods, average accuracy increased by 13.04%, 25.67%, and 7.53%, respectively. The method achieves competitive recognition results, demonstrating its effectiveness in facial expression recognition tasks.
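
For orientation, the attention module that the method builds on can be written compactly. The equations below give the standard CBAM formulation as it is commonly stated, not the improved CBAM-ERF variant, whose modifications to the channel attention branch are not specified in this abstract. The symbols are assumptions introduced here for illustration: $F$ is an intermediate feature map from the backbone, $M_c$ and $M_s$ are the channel and spatial attention maps, $\sigma$ is the sigmoid function, $\mathrm{MLP}$ is a shared two-layer perceptron, $f^{7\times 7}$ is a convolution with a $7\times 7$ kernel, and $\otimes$ is element-wise multiplication with broadcasting.

```latex
\begin{align}
M_c(F)  &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big), \\
F'      &= M_c(F) \otimes F, \\
M_s(F') &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\, \mathrm{MaxPool}(F')])\big), \\
F''     &= M_s(F') \otimes F'.
\end{align}
```

In the channel branch the pooling runs over the spatial dimensions, yielding one weight per channel. In the spatial branch it runs over the channel dimension, and the two pooled maps are concatenated before the convolution. The refined feature map $F''$ is what a residual block would pass on in place of $F$.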