Sea surface temperature (SST) is a fundamental parameter in oceanography because it strongly influences physical, chemical, and biological processes in the marine environment. In this study, we propose an Attention-based Context Fusion Network (ACFN) for short-term SST prediction based on Operational SST and Sea Ice Analysis (OSTIA) data. ACFN combines an attention-based context fusion block with the Convolutional Long Short-Term Memory (ConvLSTM) model, which allows it to capture intricate spatiotemporal correlations between the previous context state and the current input state in ConvLSTM. To assess its performance, we apply ACFN to predict SST in the Bohai Sea at lead times of 1 to 10 days. The results show that the proposed model outperforms several state-of-the-art models, namely ConvLSTM, PredRNN, and MoDeRNN, in terms of mean absolute error and coefficient of determination. Our analysis also reveals that prediction errors are higher near the coast than in the central Bohai Sea.