Ranking metrics quantify a machine learning model's prediction errors for binary classification on a given dataset, but imbalances in the class distribution or in the model's predictions can significantly influence their interpretation. We analyze and interpret the area under the receiver operating characteristic curve (auROC) and the area under the precision-recall curve, summarized by the average precision score (AP), in the presence of two kinds of imbalance: dataset class imbalance and model prediction imbalance. We show that the auROC can have a symmetric but not necessarily unimodal response to dataset class imbalance, while it is insensitive to model prediction imbalance. The AP has an asymmetric response that we show is also not unimodal. These properties make the AP and auROC scores challenging to interpret. We propose a novel precision-recall space, PR*, and an associated average precision score, AP*, to address the identified shortcomings of the AP and auROC. The proposed AP* metric has a symmetric response to class imbalance, like the auROC, and a sensitivity to model prediction imbalance, like the AP; it further adds symmetry with respect to model prediction imbalance and a unimodal response to class imbalance, which makes it more reliable for model selection than the AP or auROC. For model selection, the AP* assigns a higher score to the model that tends to make most of its errors in the majority class.
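
As a minimal illustration of the class-imbalance sensitivity discussed above (a self-contained sketch using scikit-learn's roc_auc_score and average_precision_score, not code or data from the paper), the following duplicates the negative class of a synthetic dataset while holding the model's scores fixed: the ROC curve, and hence the auROC, is unchanged because TPR and FPR are per-class rates, whereas the AP drops because precision depends on the class ratio.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)

# Balanced synthetic dataset: positive scores drawn slightly higher than negative scores.
pos_scores = rng.normal(loc=1.0, scale=1.0, size=500)
neg_scores = rng.normal(loc=0.0, scale=1.0, size=500)

y_true = np.concatenate([np.ones(500), np.zeros(500)])
y_score = np.concatenate([pos_scores, neg_scores])
print("balanced:   auROC=%.3f  AP=%.3f"
      % (roc_auc_score(y_true, y_score),
         average_precision_score(y_true, y_score)))

# Imbalanced dataset: replicate the negatives 10x with the same scores.
# auROC is identical (per-class rates are unchanged); AP decreases
# because precision is sensitive to class prevalence.
y_true_imb = np.concatenate([np.ones(500), np.zeros(500 * 10)])
y_score_imb = np.concatenate([pos_scores, np.tile(neg_scores, 10)])
print("imbalanced: auROC=%.3f  AP=%.3f"
      % (roc_auc_score(y_true_imb, y_score_imb),
         average_precision_score(y_true_imb, y_score_imb)))
```

This sketch only varies the evaluation data for a fixed set of scores; the paper's analysis additionally characterizes how the metrics respond when the model's prediction behavior itself is imbalanced.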