
Multimodal Image Fusion for Object Detection via Dynamic Channel Adjustment and Multi-Scale Activated Attention Mechanism Network
  • Yihang Ye,
  • Mingxuan Chen,
  • Anjie Wang
Yihang Ye
Shanghai University of Engineering Science
Mingxuan Chen
Shanghai University of Engineering Science

Corresponding Author:[email protected]

Anjie Wang
Peking University

Abstract

Multimodal image fusion has become crucial for object detection, offering enhanced feature representations by integrating information from diverse image modalities, such as RGB and thermal images. Recent advances in neural networks, including convolutional neural networks (CNNs) and Transformer-based approaches, have achieved substantial progress in this area. However, existing methods often struggle to fully integrate complementary information across modalities, particularly in enabling activated fusion over varying regions and scales. To address these limitations, we propose the Dynamic Channel Adjustment and Multi-Scale Activated Attention Mechanism Network (DAMAN). This model improves inter-modal feature integration and strengthens spatial and contextual information capture. Extensive experiments demonstrate DAMAN’s superior adaptability to objects of varying sizes and its robustness in complex traffic and industrial environments. Code and model checkpoints will be released following peer review.
12 Nov 2024 Submitted to Electronics Letters
13 Nov 2024 Submission Checks Completed
13 Nov 2024 Assigned to Editor
13 Nov 2024 Review(s) Completed, Editorial Evaluation Pending
15 Nov 2024 Reviewer(s) Assigned