Feature fusion is an effective way to improve image retrieval performance. Using more feature types generally improves accuracy, but it also increases complexity, so practical applications can typically afford only a limited number of feature types. Because they are strongly complementary, global and local features form an ideal combination for many fusion applications. However, the two kinds of features are intrinsically different and therefore cannot be fused in a straightforward way. In this work, we propose an integrated image retrieval and feature fusion framework for global and local features. It is based on inverted index fusion, a technique for efficient image retrieval. The core idea is to rank candidates by weighted voting during candidate selection, a step we call pre-ranking. Pre-ranking takes place before re-ranking and is potentially superior to conventional late fusion. Extensive experiments on three public datasets show that the lightweight pre-ranking stage contributes significantly to accuracy and brings substantial improvement when used together with re-ranking. Our method is robust and versatile, and can be applied to any scenario where inverted indexing is used. It is a promising technique for multimedia retrieval in the big data era.
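
As a rough illustration of ranking candidates by weighted voting during candidate selection, the sketch below merges the posting lists of two inverted indexes, one keyed by global-feature codes and one by local-feature visual words. It is a minimal sketch under stated assumptions, not the implementation evaluated in this work: the function name `pre_rank`, the index layout, the keys, and the weights are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pre_rank(query_keys, indexes, weights, top_k=3):
    """Rank candidate images by weighted votes from several inverted indexes.

    query_keys: feature type -> index keys extracted from the query image
    indexes:    feature type -> {index key -> list of image ids (posting list)}
    weights:    feature type -> vote weight (hypothetical values)
    """
    scores = defaultdict(float)
    for feature_type, keys in query_keys.items():
        index = indexes[feature_type]
        for key in keys:
            # Every image in the posting list receives one weighted vote.
            for image_id in index.get(key, ()):
                scores[image_id] += weights[feature_type]
    # Sort by descending score; break ties by image id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Toy data: all keys, image ids, and weights are made up for illustration.
indexes = {
    "global": {"g1": ["img_a", "img_b"], "g2": ["img_c"]},
    "local": {"w7": ["img_a", "img_c"], "w9": ["img_a"], "w3": ["img_b"]},
}
weights = {"global": 2.0, "local": 1.0}
query_keys = {"global": ["g1"], "local": ["w7", "w9"]}

candidates = pre_rank(query_keys, indexes, weights)
print(candidates)
```

In this toy setup, the shortlist returned by `pre_rank` is what a later re-ranking stage would receive; the fusion happens while candidates are being selected rather than after separate result lists have been produced.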