Ji Xu

and 5 more

The centrality and diversity of the labeled data strongly influence the performance of semi-supervised learning (SSL). Most existing SSL models select the labeled data at random and allocate the labeling quota equally among the classes, which leads to considerable instability and degraded performance. Active learning has been proposed to address instance selection when only a few labels are available, but its iterative, progressive procedure incurs a heavy computational cost. The optimal leading forest (OLF) has the advantage of revealing how differences evolve along a path within a subtree. This study constructs a leading forest in an unsupervised manner; the forest induces a new metric space in which the most central and most diverse samples can be conveniently selected in one shot. The labeling quota can be allocated flexibly according to the data distribution. A discrete optimization problem is formulated on the new metric space to select the samples to label. With the small number of selected instances, a kernelized large-margin projection can be learned efficiently to classify the remaining unlabeled samples. The multi-modal issue in SSL is addressed by the multi-granular structure of the leading forest, which readily facilitates learning multiple local metrics. Extensive experiments demonstrate that the proposed method achieves competitive efficiency and encouraging accuracy compared with state-of-the-art methods. The code is available at https://github.com/alanxuji/DeLaLA
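
The one-shot selection idea can be illustrated with a minimal sketch. It assumes a density-peaks-style construction in which each sample's leader is its nearest neighbor of higher local density; the abstract does not spell out the construction, so this is an assumption. It also swaps the discrete optimization mentioned above for a plain top-k ranking by density times leader distance. The function names, the Gaussian `bandwidth` parameter, and the ranking rule are hypothetical and are not taken from the DeLaLA repository.

```python
import numpy as np

def leading_forest_scores(X, bandwidth=1.0):
    """Return (rho, delta, leader) for the rows of X.

    Assumed construction (not confirmed by the source):
    rho    : Gaussian-kernel local density of each sample
    leader : index of the nearest sample with higher density (-1 for the root)
    delta  : distance to that leader
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Subtract 1 to remove each sample's kernel contribution to itself.
    rho = np.exp(-(dist / bandwidth) ** 2).sum(axis=1) - 1.0

    # Visit samples from densest to sparsest so that "denser" is a prefix.
    order = np.argsort(-rho, kind="stable")
    leader = np.full(n, -1)
    delta = np.zeros(n)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest sample has no leader; give it the largest distance.
            delta[i] = dist[i].max()
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        leader[i] = j
        delta[i] = dist[i, j]
    return rho, delta, leader

def select_to_label(X, quota, bandwidth=1.0):
    """Pick `quota` samples that are both central (high rho) and far from
    any denser sample (high delta). Illustrative stand-in for the paper's
    discrete optimization step."""
    rho, delta, _ = leading_forest_scores(X, bandwidth)
    gamma = rho * delta
    return np.argsort(-gamma, kind="stable")[:quota]
```

Because the score is computed once from the unlabeled data, the samples to label are chosen in a single pass rather than through the repeated query rounds of active learning. On two well-separated clusters with a quota of 2, this sketch would be expected to return one sample near the center of each cluster.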