Large-scale classification poses significant challenges for storage and computation. Two major solutions address this problem: data compression and quantization. In this paper, we study the method of first reducing the data dimension by random projection and then quantizing the projections to ternary or binary codes, a scheme that has been widely applied in practice. Usually, such extreme quantization degrades classification accuracy due to high quantization errors. Interestingly, however, we observe that quantization can improve, rather than degrade, performance if the data to be quantized are first preprocessed by a sparse transform. Moreover, the quantization gain can also be obtained on random projections of the data, provided that both the data and the random projection matrix are sparse enough that the resulting projections remain sparse. This intriguing behavior is verified and analyzed through extensive experiments.
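
To make the studied pipeline concrete, the sketch below projects sparse data with a sparse random matrix and then quantizes the projections to ternary and binary codes. This is a minimal illustration, not the paper's implementation: the Achlioptas-style sparse sign matrix, the `density` and `threshold` parameters, and all function names are our own assumptions.

```python
import numpy as np

def sparse_random_projection_matrix(d, k, density=0.1, seed=0):
    """Sparse sign matrix: each entry is +1 or -1 with probability density/2,
    and 0 otherwise (an Achlioptas-style construction, assumed here)."""
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(d, k))
    mask = rng.random((d, k)) < density
    return signs * mask / np.sqrt(k * density)

def ternary_quantize(z, threshold):
    """Map each projection value to {-1, 0, +1} by thresholding."""
    q = np.zeros_like(z)
    q[z > threshold] = 1.0
    q[z < -threshold] = -1.0
    return q

def binary_quantize(z):
    """Map each projection value to {-1, +1} by its sign."""
    return np.where(z >= 0, 1.0, -1.0)

# Usage: project sparse data X (n x d) down to k dimensions, then quantize.
n, d, k = 100, 1000, 64
rng = np.random.default_rng(1)
X = rng.standard_normal((n, d)) * (rng.random((n, d)) < 0.05)  # sparse data
R = sparse_random_projection_matrix(d, k, density=0.1)
Z = X @ R                                    # low-dimensional projections
T = ternary_quantize(Z, threshold=0.1 * np.abs(Z).max())  # ternary codes
B = binary_quantize(Z)                                    # binary codes
```

When both `X` and `R` are sparse, the projections `Z` tend to remain sparse, which is the regime in which the abstract reports a quantization gain rather than a loss.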