Model Performance and Environmental Novelty Analysis
We evaluated each model’s forecasting ability during a period of
multiple MHWs in the NEP by training each model on the full datasets
from 1995 to 2013 and testing its performance at monthly time steps on
the held-out data from 2014 to 2019 (Figure 2). Model performance was
assessed across two dimensions: predictive skill and ecological realism. Predictive skill denotes the model’s
ability to accurately distinguish locations where a species was present from
those where it was absent in independent occurrence data. We
quantified predictive skill using two metrics: the Area Under the
Receiver Operating Characteristic Curve (AUC) and Mean Absolute Error (MAE). Using AUC
and MAE together has been recommended for assessing SDMs, as they provide
complementary insights on model performance while addressing
shortcomings inherent in each metric’s assumptions (Konowalik and Nosol
2021). In addition to predictive skill, we evaluated ecological
realism, which considers a model’s capacity to estimate biologically
plausible species-environment relationships and predict spatiotemporal
patterns consistent with observed ecological processes. This was
assessed qualitatively by 1) comparing spatial predictions for a
forecasted month to the known distribution of albacore during an MHW and
2) analyzing partial response curves to determine whether they aligned
with observed environmental covariate distributions during the training
(1995–2013) and forecasting (2014–2019) periods.
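The two predictive skill metrics described above can be illustrated with a minimal sketch. It assumes binary presence/absence observations (1/0) and predicted probabilities of presence; the function names `auc` and `mae` are hypothetical and are not taken from the study's code. AUC is computed here through its pairwise-ranking interpretation (the probability that a randomly chosen presence receives a higher prediction than a randomly chosen absence), and MAE as the mean absolute difference between observed and predicted values.

```python
def auc(observed, predicted):
    """AUC as the fraction of presence/absence pairs ranked correctly.

    Assumes `observed` holds 1 (presence) or 0 (absence) and `predicted`
    holds the model's predicted probability of presence.
    """
    presences = [p for o, p in zip(observed, predicted) if o == 1]
    absences = [p for o, p in zip(observed, predicted) if o == 0]
    if not presences or not absences:
        raise ValueError("AUC needs at least one presence and one absence")

    correctly_ranked = 0.0
    for pres in presences:
        for absn in absences:
            if pres > absn:
                correctly_ranked += 1.0
            elif pres == absn:
                correctly_ranked += 0.5  # ties count as half
    return correctly_ranked / (len(presences) * len(absences))


def mae(observed, predicted):
    """Mean absolute difference between observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Illustrative values only, not data from the study.
obs = [1, 1, 0, 0]
pred = [0.9, 0.4, 0.6, 0.1]
print(auc(obs, pred), mae(obs, pred))
```

Under these assumptions, AUC reflects only how well predictions rank presences above absences, whereas MAE is sensitive to how far the predicted probabilities sit from the observed values, which is one way the two metrics complement each other.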
Beyond assessing overall model performance, we evaluated how effectively
each model type handled environmental novelty using Hellinger Distance
(Legendre and Legendre 2012, Johnson and Watson 2021, Karp et al. 2023),
which measures the difference between two probability distributions (see
Karp et al. 2023 for formulas). For both dynamic environmental covariates
(SST and MLD), we calculated Hellinger Distance for each month-year
in the test data relative to the climatological conditions of the same
calendar month in the training data. Hellinger Distance quantifies the extent of
extrapolation required by the fitted SDM when making predictions and
ranges from 0 to 1, where a value of 0 indicates that the two
distributions are identical (i.e., complete data overlap), a value of 1
indicates no overlap, and values >0.5 indicate greater dissimilarity than
similarity between the two distributions (Johnson and Watson 2021).
Finally, we assessed the impact of environmental novelty on forecast
skill by examining how AUC and MAE each varied with Hellinger
Distance across the different models.
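As an illustration of the novelty calculation described above, the following is a minimal sketch of a Hellinger Distance between two samples of a single covariate (e.g., SST values from one test month-year and from the same calendar month across the training years). It assumes the distributions are approximated by histograms on shared, equal-width bins; the bin count, the function names `hellinger_distance` and `monthly_novelty`, and the dictionary layout are all hypothetical choices made for this example. The formulation actually used in the study is the one given in Karp et al. (2023).

```python
import math


def hellinger_distance(train_values, test_values, n_bins=20):
    """Hellinger Distance between two samples, using shared histogram bins.

    Assumption: each distribution is approximated by relative frequencies
    in `n_bins` equal-width bins spanning the combined range of both samples.
    """
    lo = min(min(train_values), min(test_values))
    hi = max(max(train_values), max(test_values))
    if hi == lo:
        return 0.0  # all values identical, so the distributions coincide
    width = (hi - lo) / n_bins

    def binned_probs(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so the maximum value falls in the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    p = binned_probs(train_values)
    q = binned_probs(test_values)
    # Bhattacharyya coefficient: 1 for identical histograms, 0 for disjoint ones.
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))


def monthly_novelty(train_by_month, test_by_month_year, n_bins=20):
    """Compare each test (year, month) with the same calendar month in training.

    `train_by_month` maps month -> covariate values pooled over training years;
    `test_by_month_year` maps (year, month) -> covariate values for that month.
    """
    return {
        (year, month): hellinger_distance(train_by_month[month], values, n_bins)
        for (year, month), values in test_by_month_year.items()
    }
```

With this sketch, a test month whose covariate values fall entirely outside the training range for that calendar month returns 1, and a test month with the same binned distribution as the training climatology returns 0. Note that a histogram-based estimate depends on the chosen bin count, so values from this sketch would not necessarily match those reported in the study.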