BiAttentionNet: A dual-branch automatic driving image segmentation
network integrating spatial and channel attention mechanisms
Abstract
Real-time semantic segmentation is one of the most actively studied
problems in computer vision, and dual-branch networks have become a
popular direction in architecture design. In this
paper, we propose a dual-branch automatic driving image segmentation
network integrating spatial and channel attention mechanisms, named
"BiAttentionNet". The network aims to balance accuracy and
real-time performance by separately processing semantic information in
high-level features and detail information in low-level features.
BiAttentionNet consists of three main parts: the detail branch, the
semantic branch, and the proposed attention-guided fusion layer. The
detail branch extracts local and surrounding context features using the
designed PCSD convolution module to process wide-channel low-level
feature information. The semantic branch uses an improved lightweight
U-Net to extract semantic information from deep, narrow channels.
Finally, the proposed attention-guided fusion layer
fuses the features of the dual branches using detail attention and
channel attention mechanisms to perform image segmentation in road
scenes. Comparative experiments on the Cityscapes dataset against recent
mainstream networks such as BiSeNet V2, Fast-SCNN, ConvNeXt, SegNeXt,
SegFormer, and CGNet show that BiAttentionNet achieves the highest
accuracy among the compared backbone networks, with an mIoU of 65.89%,
validating the effectiveness of the proposed BiAttentionNet.
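To make the fusion step concrete, the sketch below illustrates one plausible reading of an attention-guided fusion of the two branches. The abstract does not give exact formulas, so the sigmoid-gated spatial and channel attention below (and the names `attention_guided_fusion`, `detail`, `semantic`) are assumptions following the common attention-fusion pattern, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_guided_fusion(detail, semantic):
    """Fuse a detail feature map with a semantic feature map of the same
    shape (C, H, W) using spatial and channel attention.

    Hypothetical sketch: we assume each branch is re-weighted by an
    attention gate derived from the other branch, then the two are summed.
    """
    # Spatial (detail) attention: per-pixel gate from channel-averaged detail features
    spatial_gate = sigmoid(detail.mean(axis=0, keepdims=True))        # (1, H, W)
    # Channel attention: per-channel gate from globally pooled semantic features
    channel_gate = sigmoid(semantic.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    # Cross-gated fusion of the two branches
    return detail * channel_gate + semantic * spatial_gate

C, H, W = 4, 8, 8
fused = attention_guided_fusion(np.random.rand(C, H, W),
                                np.random.rand(C, H, W))
print(fused.shape)  # (4, 8, 8)
```

In this reading, the detail branch's wide channels are modulated by semantic context, while the semantic branch is sharpened by the detail branch's spatial gate, which matches the stated goal of balancing accuracy and real-time performance.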