Finding the Root Causes of Statistical Inconsistency in Community Earth
System Model Output
Abstract
Baker et al. (2015) developed the Community Earth System Model Ensemble
Consistency Test (CESM-ECT) to provide a metric for software quality
assurance by determining statistical consistency between an ensemble of
CESM outputs and new test runs. The test has proved useful for detecting
statistical differences caused by compiler bugs and errors in physical
modules. However, detection is only the necessary first step in finding
the causes of those differences. CESM is a highly complex model
comprising millions of lines of code, developed and maintained
by a large community of software engineers and scientists, so any root
cause analysis is correspondingly challenging. We propose a new
capability for CESM-ECT: identifying the sections of code that cause
statistical distinguishability. The first step is to discover CESM
variables that cause CESM-ECT to classify new runs as statistically
distinct, which we achieve via Randomized Logistic Regression. Next, we
use a tool developed to identify the CESM components that define or compute
the variables found in the first step. Finally, we employ the
application Kernel GENerator (KGEN) developed by Kim et al. (2016) to
detect fine-grained floating-point differences. We demonstrate an
example of the procedure and outline a plan to automate this process in
future work.
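
As an illustration of the variable-selection step, the following is a minimal sketch of randomized logistic regression in the stability-selection sense: an L1-penalized logistic regression is fit repeatedly on random half-samples with randomly rescaled per-variable penalties, and each variable is scored by how often it receives a nonzero coefficient. This is not the implementation used in this work. The data are synthetic, the variable names (var_a, var_b, var_c) are hypothetical, and the solver, penalty strength, and number of rounds are assumptions chosen only to keep the example small.

```python
import math
import random

def fit_l1_logistic(X, y, penalties, steps=200, lr=0.1):
    """L1-penalized logistic regression via proximal gradient descent (ISTA)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))           # guard against overflow
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b                           # intercept is not penalized
        for j in range(p):
            v = w[j] - lr * grad_w[j]
            t = lr * penalties[j]
            # Soft-thresholding sets small coefficients exactly to zero.
            w[j] = math.copysign(max(abs(v) - t, 0.0), v)
    return w

def selection_frequencies(X, y, lam=0.1, weakness=0.5, rounds=40, seed=0):
    """Fraction of randomized fits in which each variable is selected."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(rounds):
        idx = rng.sample(range(n), n // 2)         # random half-sample of runs
        # The "randomized" part: inflate each variable's penalty at random.
        penalties = [lam / rng.uniform(weakness, 1.0) for _ in range(p)]
        w = fit_l1_logistic([X[i] for i in idx], [y[i] for i in idx], penalties)
        for j in range(p):
            if w[j] != 0.0:
                counts[j] += 1
    return [c / rounds for c in counts]

# Synthetic stand-in for model output (NOT CESM data): label 0 marks ensemble
# runs, label 1 marks new test runs. Only the first variable is shifted in
# the test runs; the other two are pure noise.
names = ["var_a", "var_b", "var_c"]                # hypothetical names
data_rng = random.Random(1)
X, y = [], []
for label in (0, 1):
    for _ in range(60):
        shift = 2.0 if label == 1 else 0.0
        X.append([data_rng.gauss(shift, 1.0),
                  data_rng.gauss(0.0, 1.0),
                  data_rng.gauss(0.0, 1.0)])
        y.append(label)

freqs = selection_frequencies(X, y)
for name, freq in zip(names, freqs):
    print(f"{name}: selected in {freq:.0%} of randomized fits")
```

Variables with a selection frequency near one would be the candidates passed to the second step, where the code that defines or computes them is located. Scoring by selection frequency rather than by a single fit makes the ranking less sensitive to the choice of penalty and to the particular runs sampled.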