Can unsupervised profile classification help create interpretable and
robust oceanographic knowledge?
Abstract
Oceanographic structure is often represented as a collection of vertical
profiles, i.e. temperature, salinity, and/or biogeochemical values at
various depths. These profiles contain information about water mass
structures and the boundaries between them, which are consequences of
the integrated effects of water mass formation, advection, and
destruction. In recent years, researchers have applied various
unsupervised profile classification methods to identify a
set of “profile types” and the spatially coherent regimes associated
with them. These efforts have identified a number of regimes that are
consistent with existing oceanographic knowledge, and they have also
identified previously under-appreciated structural differences. However,
as this application area matures, questions remain about the strengths
and limitations of these methods as applied to oceanography. A key
question is “under what circumstances does unsupervised profile
classification produce interpretable and scientifically useful
knowledge?” Here, I explore the mechanisms and parameters of various
unsupervised learning approaches, in particular Gaussian Mixture
Modeling, in an attempt to clarify the conditions under which
unsupervised learning produces robust, interpretable, and trustworthy
understanding. As with pattern classification approaches in general,
there is a tradeoff between interpretability and accuracy (the ability
of the method to represent the full underlying structure of the system).
As a case study, I explore an unsupervised profile classification
application in the Weddell Gyre. I show that a combination of
statistical guidance, expert judgment, and traditional oceanographic
analysis can, in some cases, increase the interpretability of a
profile classification model with acceptable losses in accuracy. The
goal is to elucidate the conditions under which unsupervised learning
can be fully integrated into the oceanographic knowledge generation
process, both by confronting existing understanding and by highlighting
new avenues for exploration.
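
To make the workflow described above concrete, the following is a minimal sketch of unsupervised profile classification with a Gaussian mixture model. It is an illustration under stated assumptions, not the analysis used in the Weddell Gyre case study: the profiles are synthetic, and the two idealized "warm" and "cold" profile shapes, the noise level, the number of principal components, and the range of candidate class counts are all hypothetical choices. It assumes NumPy and scikit-learn are available. The Bayesian information criterion (BIC) stands in for the "statistical guidance" step; in practice the final number of classes would also be weighed against expert judgment.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
depth = np.linspace(0.0, 1000.0, 20)  # depth levels in metres (assumed grid)

# Two hypothetical temperature profile types (synthetic, not observations).
warm = 2.0 + 10.0 * np.exp(-depth / 200.0)
cold = 0.5 + 1.0 * np.exp(-depth / 300.0)
n_per_type = 200
profiles = np.vstack([
    warm + rng.normal(0.0, 0.2, size=(n_per_type, depth.size)),
    cold + rng.normal(0.0, 0.2, size=(n_per_type, depth.size)),
])
true_type = np.repeat([0, 1], n_per_type)  # kept only to check the result

# Standardize each depth level, then reduce dimensionality with PCA so the
# mixture model is fitted in a low-dimensional space.
standardized = (profiles - profiles.mean(axis=0)) / profiles.std(axis=0)
reduced = PCA(n_components=3).fit_transform(standardized)

# Fit one mixture model per candidate number of classes and record its BIC.
candidates = range(1, 6)
models = [
    GaussianMixture(n_components=k, covariance_type="full",
                    n_init=5, random_state=0).fit(reduced)
    for k in candidates
]
bic = [m.bic(reduced) for m in models]

# Lowest BIC is the statistical suggestion; it is guidance, not a verdict.
best = models[int(np.argmin(bic))]
labels = best.predict(reduced)           # hard class assignment per profile
posterior = best.predict_proba(reduced)  # class membership probabilities

print("BIC by number of classes:", dict(zip(candidates, np.round(bic, 1))))
print("Selected number of classes:", best.n_components)
```

The posterior probabilities are worth inspecting alongside the hard labels: profiles with no dominant class tend to sit near the boundaries between regimes, which is one place where the tradeoff between interpretability and accuracy becomes visible.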