Machine learning (ML) is increasingly considered the solution to environmental problems where only limited or no physico-chemical process understanding is available. But when there is a need to support high-stakes decisions, where the ability to explain possible solutions is key to their acceptability and legitimacy, ML can fall short. Here, we develop a method, rooted in formal sensitivity analysis (SA), that can detect the primary controls on the outputs of ML models. Unlike many common methods for explainable artificial intelligence (XAI), this method can account for the complex multivariate distributional properties of input-output data that are commonly observed in environmental systems. We apply this approach to a suite of ML models developed to predict various water quality variables in a pilot-scale experimental pit lake. A critical finding is that subtle alterations in the design of an ML model (such as variations in the random seed for initialization, functional class, hyperparameters, or data splitting) can lead to entirely different interpretations of how the outputs depend on the explanatory inputs. Further, models from different ML families (decision-tree, connectionist, or kernel-based) seem to focus on different aspects of the information provided by the data, despite displaying similar levels of predictive power. Overall, this underscores the importance of employing ensembles of ML models when explanatory power is sought. Not doing so may compromise the ability of the analysis to deliver robust and reliable predictions, especially when generalizing to conditions beyond the training data.