Tarun Agrawal

and 2 more

Surface runoff and infiltrated water interact with dynamic landscape properties en route to the stream, ranging from vegetation and microbial activity to soil and geological attributes. Because of these interactions, flow paths, and residence times, stream solute concentrations are highly variable and interconnected, and often exhibit hysteresis with flow. Significant unknowns remain about how point measurements of stream solute chemistry reflect interdependent hydrobiogeochemical and physical processes, and how the signatures of those processes are encapsulated as nonlinear dynamical relationships between variables. We take a machine learning approach to understand and capture these dynamical relationships and to improve predictions of solutes at short and long time scales. We introduce a physical process-based "flow-gate" into an LSTM (long short-term memory) model, which enables the model to learn hysteresis behaviors if they exist. Further, we use information-theoretic metrics to detect how solutes are interdependent, and iteratively select the source solutes that best predict a given target solute concentration. The "flow-gate LSTM" model improves predictions relative to the standard LSTM model for all nine solutes included in the study, with RMSE values decreasing by 1% to 32%. These predictive improvements highlight the importance of lagged concentration and discharge relationships for certain solutes. They also indicate a potential limitation of the standard LSTM approach: flow rates are always provided as inputs, but this information is not fully utilized. This work provides a starting point for a predictive understanding of geochemical interdependencies using machine learning approaches and highlights potential improvements in model architecture.
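As a rough illustration of the source-selection idea, the sketch below ranks candidate source series by their mutual information with a target solute. The abstract does not state which information-theoretic metric or selection procedure is used, so everything here is an assumption made for illustration: the histogram (plug-in) estimator, the bin count, the function names `mutual_information` and `rank_sources`, and the synthetic series are all hypothetical and are not taken from the study.

```python
import numpy as np


def mutual_information(x, y, bins=16):
    """Histogram (plug-in) estimate of mutual information, in bits.

    Assumption: equal-width bins; the study's actual estimator is not specified.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probability table
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def rank_sources(sources, target, bins=16):
    """Return (name, MI) pairs sorted from most to least informative."""
    scores = {
        name: mutual_information(series, target, bins)
        for name, series in sources.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic data only; these series do not represent measurements from the study.
rng = np.random.default_rng(0)
n = 5000
discharge = rng.lognormal(size=n)
target = np.log(discharge) + 0.1 * rng.normal(size=n)
sources = {
    "discharge": discharge,
    "related_solute": target + 0.5 * rng.normal(size=n),
    "unrelated_solute": rng.normal(size=n),
}

ranking = rank_sources(sources, target)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

A single-pass ranking like this ignores redundancy between sources. An iterative selection would typically re-score the remaining candidates after each pick, for example with a conditional measure, and lagged copies of each series could be added as further candidates to probe delayed dependence.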