Yi Jie Wong and 5 more authors

Deep learning has significantly advanced building extraction from remote sensing images, providing robust solutions for identifying and delineating building footprints. However, domain adaptation remains a major challenge, particularly across cities: building appearance differs significantly from city to city because of variations in building shapes and environmental characteristics. Consequently, models trained on data from one city often struggle to accurately identify buildings in another. In this paper, we address this challenge from a data-centric perspective, focusing on diversifying the training set. Our empirical results show that improving data diversity via open-source datasets and diffusion augmentation significantly improves the performance of the segmentation model. Our baseline model, trained with no extra dataset, achieved a private F1 score of only 0.663. In contrast, our model trained with additional Las Vegas building footprints extracted from the Microsoft Building Footprint dataset achieved a private F1 score of 0.703. Surprisingly, we found that diffusion augmentation raises the score to 0.681, above the baseline, without requiring an extra dataset. Finally, we also experimented with the Non-Maximum Suppression (NMS) hyperparameter to improve the model's performance in segmenting dense and small objects, which gave us a private F1 score of 0.897. These techniques ultimately led our solution to rank 1st in the competition. Our source code and pretrained models are publicly available at https://github.com/DoubleY-BEGC2024/OurSolution.
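To illustrate why an NMS setting can matter for dense, small buildings, the following is a minimal pure-Python sketch of greedy box NMS. It assumes the tuned hyperparameter is the IoU threshold, which the abstract does not state; the function names `iou` and `nms` and the example boxes are hypothetical and are not taken from the released code.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: keep a box unless it overlaps a higher-scoring kept box
    by more than iou_threshold. Returns kept indices in descending score order."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two adjacent buildings whose boxes overlap (IoU = 60/140), plus a distant one.
boxes = [(0, 0, 10, 10), (4, 0, 14, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

print(nms(boxes, scores, iou_threshold=0.3))  # strict threshold drops the neighbour
print(nms(boxes, scores, iou_threshold=0.5))  # looser threshold keeps both neighbours
```

In this toy case, a strict threshold merges two neighbouring detections into one, while a looser one keeps both, which is the kind of trade-off that arises in densely built areas.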