The goal of infrared and visible image fusion is to combine complementary information from the source images. However, owing to the absence of ground truth, most fusion algorithms rely solely on information from the source images, which cannot provide targeted guidance for network learning and thus leads to suboptimal results. Moreover, many methods focus solely on modifying the network architecture to improve fusion performance, without optimizing the fusion algorithm from other perspectives. To tackle these problems, we propose AGMFusion, a real-time end-to-end infrared and visible image fusion network based on an adaptive guidance module. On the one hand, by combining state-of-the-art methods with the best fusion outcomes obtained during training, the adaptive guidance module effectively provides collaborative guidance for network training. Building on this module, we devise a loss function comprising a content loss and a guidance loss, and we balance these two components with an adaptive weight to boost the performance of our framework. On the other hand, AGMFusion is a lightweight fusion framework that generates highly perceptual fused images while maintaining excellent real-time performance, so it can potentially be deployed as a pre-processing unit for various vision tasks. Extensive comparative and generalization experiments show that AGMFusion surpasses existing methods in both visual quality and quantitative metrics. Notably, comparisons of numerous frameworks in running efficiency and object detection further highlight the advantages of our approach. The source code will be released at https://github.com/liushh39/AGMFusion.
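The abstract describes a loss composed of a content term and a guidance term balanced by an adaptive weight. The following is a minimal illustrative sketch of one such balancing scheme; the function names (`adaptive_weight`, `total_loss`) and the specific rescaling rule are assumptions for illustration, not the paper's actual formulation.

```python
# Hypothetical sketch: total loss = content loss + adaptive weight * guidance loss.
# The balancing rule below (matching the guidance term's magnitude to the
# content term's) is an illustrative assumption, not AGMFusion's definition.

def adaptive_weight(content_loss: float, guidance_loss: float, eps: float = 1e-8) -> float:
    """Scale the guidance term so its contribution matches the content term."""
    return content_loss / (guidance_loss + eps)

def total_loss(content_loss: float, guidance_loss: float) -> float:
    """Combine the two loss components with the adaptive weight."""
    w = adaptive_weight(content_loss, guidance_loss)
    return content_loss + w * guidance_loss

# With this scheme the weighted guidance term equals the content term,
# so neither component dominates training.
print(total_loss(0.5, 0.25))
```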