Robust Semantic Foggy Scene Segmentation (SFSS) is crucial for the safety of autonomous driving. However, the image blur caused by fog increases the difficulty of recognition and makes annotating foggy scene images more expensive, resulting in poor performance when recognizing entities in fog. Many current methods use domain adaptation to transfer segmentation knowledge from clear scenes to foggy ones, but these methods are often ineffective due to the large domain gap between different cities’ styles and the image quality degradation caused by fog. The latest research has attempted to introduce an intermediate domain to decouple the domain gap and gradually complete semantic segmentation of foggy scenes, but the information in the intermediate domain is often insufficiently explored. To address these problems, we first analyze self-training in domain adaptation and propose the concept of “label reference value”. We prove that the higher the total label reference value, the more easily self-training performance improves. Under this premise, we can reasonably split the original problem into a two-stage domain adaptation in which the “label reference value” can be controlled and maximized at each stage. Specifically, the first stage processes only the style gap between the source domain and the intermediate domain, while the second stage processes the fog gap, which includes (1) the real fog gap between the intermediate domain and the target domain and (2) the synthetic fog gap between the clear source domain and the synthetic foggy source domain. This allows the model to make full use of the “label reference value” and gradually develop strong semantic segmentation ability in foggy scenes. Our approach significantly outperforms the baseline algorithm on all mainstream SFSS benchmarks and generalizes well to other adverse scenes such as rain and snow. We also compare our method with the latest large segmentation models, showing that our method is more robust in foggy scenes.