DUCFNet: Dual U-shaped Cross-modal Fusion Network for Lung Infection Region Segmentation
  • Shangwang Liu, Henan Normal University (Corresponding Author: [email protected])
  • Mengjiao Zhao, Henan Normal University

Abstract

To promote further development of medical image segmentation, there is an increasing demand for high-quality datasets. Regrettably, two major obstacles stand in the way: the difficulty of acquiring usable medical images and the financial burden of annotating data to construct high-quality datasets. To overcome these difficulties, we leverage medical text data to compensate for the shortcomings of existing image datasets. In this work, we propose a dual U-shaped network to achieve thorough cross-modal fusion of image and text features. Specifically, one of the U-shaped branches is based on a convolutional neural network, named U-CNN, which mainly extracts global image features and generates the final prediction results. The other, named U-ViT, is built from vision transformer blocks and is responsible for processing text information and merging the text features with the image features from U-CNN. Additionally, we equip the skip connections of U-CNN with a Cross-Attention Channel Fusion module and a Channel-wise Dual-branch Cross Fusion module, which help bridge semantic gaps and further integrate cross-modal information. Experimental results on two lung infection image datasets with different modalities (X-Ray and CT) show that our method achieves excellent performance compared to state-of-the-art alternatives.
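To make the cross-modal fusion idea more concrete, the minimal sketch below shows one generic way to let image tokens attend to text tokens via multi-head cross-attention. It is an illustrative assumption only, not the paper's actual Cross-Attention Channel Fusion or Channel-wise Dual-branch Cross Fusion modules; all names (CrossAttentionFusion, embed_dim, num_heads) and shapes are hypothetical.

```python
# Illustrative sketch: generic cross-attention fusion of image and text features.
# NOT the authors' modules; names and dimensions are assumptions for clarity.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Fuse image tokens (queries) with text tokens (keys/values)."""

    def __init__(self, embed_dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, img_tokens: torch.Tensor, txt_tokens: torch.Tensor) -> torch.Tensor:
        # img_tokens: (B, N_img, C) flattened image feature map
        # txt_tokens: (B, N_txt, C) encoded text description
        fused, _ = self.attn(query=img_tokens, key=txt_tokens, value=txt_tokens)
        return self.norm(img_tokens + fused)  # residual connection keeps image content


if __name__ == "__main__":
    B, N_img, N_txt, C = 2, 196, 24, 256
    fusion = CrossAttentionFusion(embed_dim=C)
    img = torch.randn(B, N_img, C)   # e.g., a 14x14 feature map flattened to tokens
    txt = torch.randn(B, N_txt, C)   # e.g., embedded clinical text tokens
    out = fusion(img, txt)
    print(out.shape)  # torch.Size([2, 196, 256])
```

In this kind of design, the image branch keeps its spatial resolution while each image token selectively pulls in relevant textual context, which is one common way to bridge the semantic gap between modalities before decoding the segmentation mask.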
Submitted to Expert Systems