Multimodal image fusion has become crucial for object detection, offering enhanced feature representations by integrating information from diverse image modalities, such as RGB and thermal images. Recent advances in neural networks, including convolutional neural networks (CNNs) and Transformer-based approaches, have achieved substantial progress in this area. However, existing methods often struggle to fully integrate complementary information across modalities, particularly in enabling activated fusion over varying regions and scales. To address these limitations, we propose the Dynamic Channel Adjustment and Multi-Scale Activated Attention Mechanism Network (DAMAN). This model improves inter-modal feature integration and strengthens spatial and contextual information capture. Extensive experiments demonstrate DAMAN's superior adaptability to objects of varying sizes and its robustness in complex traffic and industrial environments. Code and model checkpoints will be released following peer review.
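The abstract does not detail DAMAN's modules, so the following is only a minimal PyTorch sketch of how the two named ideas could compose: a squeeze-and-excitation-style gate standing in for "dynamic channel adjustment" and sigmoid-activated spatial attention pooled at several scales standing in for "multi-scale activated attention". All class names, shapes, and operations here are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of the two fusion ideas named in the abstract.
    # Every module name and design choice below is an assumption.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DynamicChannelAdjustment(nn.Module):
        """Squeeze-and-excitation-style gate over concatenated modality channels."""
        def __init__(self, channels: int, reduction: int = 4):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(2 * channels, 2 * channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(2 * channels // reduction, 2 * channels),
                nn.Sigmoid(),
            )

        def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
            x = torch.cat([rgb, thermal], dim=1)      # (B, 2C, H, W)
            w = self.fc(x.mean(dim=(2, 3)))           # global pool -> per-channel gate
            return x * w[:, :, None, None]            # reweighted joint features

    class MultiScaleActivatedAttention(nn.Module):
        """Sigmoid-activated spatial attention computed at several pooling scales."""
        def __init__(self, channels: int, scales=(1, 2, 4)):
            super().__init__()
            self.scales = scales
            self.conv = nn.Conv2d(channels, 1, kernel_size=3, padding=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            h, w = x.shape[-2:]
            att = 0.0
            for s in self.scales:
                pooled = F.avg_pool2d(x, kernel_size=s) if s > 1 else x
                m = torch.sigmoid(self.conv(pooled))  # activated attention map
                att = att + F.interpolate(m, size=(h, w), mode="bilinear",
                                          align_corners=False)
            return x * (att / len(self.scales))       # averaged multi-scale gating

    class FusionBlock(nn.Module):
        """Toy fusion head: adjust channels across modalities, then attend spatially."""
        def __init__(self, channels: int):
            super().__init__()
            self.dca = DynamicChannelAdjustment(channels)
            self.msa = MultiScaleActivatedAttention(2 * channels)
            self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

        def forward(self, rgb, thermal):
            return self.proj(self.msa(self.dca(rgb, thermal)))

    if __name__ == "__main__":
        rgb = torch.randn(2, 64, 32, 32)      # dummy RGB feature map
        thermal = torch.randn(2, 64, 32, 32)  # dummy thermal feature map
        fused = FusionBlock(64)(rgb, thermal)
        print(fused.shape)                    # torch.Size([2, 64, 32, 32])

The sketch reflects the two claims the abstract makes: channel-wise reweighting gives the network a way to emphasize whichever modality is more informative per channel, while attention maps computed at multiple pooling scales let the fusion respond to objects of varying sizes.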