Abstract
A Siamese tracker consists of two components: a classification network
and a regression network. Despite their different roles, most Siamese
trackers use similar feature fusion modules in both networks and thus
neglect the distinct characteristics of each. In this work, we
experimentally discover that the two networks place different levels of
emphasis on different types of information. Specifically, regression
tends to rely on semantic information, while classification places more
emphasis on global information. Therefore, we propose a new tracking
architecture named SGTrack, which includes a semantic augmentation fusion
(SAF) module for regression and a global relevance fusion (GRF) module for
classification. This design lets each network exploit the type of
information it relies on most. Experiments on five benchmarks show that
our method notably improves tracking performance while preserving
real-time speed.