Modeling methods
Three methods were used to build models: boosted regression trees
(BRT), generalized linear models of the binomial family (GLM), and
maximum entropy (Maxent) (Hijmans & Graham, 2006; Elith, Leathwick, &
Hastie, 2008; Franklin, 2010). Each model was fitted to field survey
presence records and 200 randomly generated pseudoabsence (background)
points (Franklin, 1995; Franklin, 2010; Elith, Kearney, & Phillips,
2010; Elith & Franklin, 2013; Phillips & Elith, 2013; Guillera-Arroita
et al., 2015).
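As an illustration of how random pseudoabsence points can be generated, the following minimal Python sketch draws 200 points uniformly within a rectangular extent. The function name, the extent, the seed, and the 0 label are hypothetical choices made for this example; the sampling region and software actually used in the study are not specified here.

```python
import random

def sample_background_points(extent, n_points=200, seed=0):
    """Draw n_points uniformly at random inside a rectangular extent.

    extent is (xmin, xmax, ymin, ymax). The rectangle is an assumption
    made for this sketch, not the study area.
    """
    xmin, xmax, ymin, ymax = extent
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
            for _ in range(n_points)]

# Hypothetical extent in arbitrary projected units.
extent = (0.0, 1000.0, 0.0, 500.0)
background = sample_background_points(extent)

# Label background points 0 so they can be stacked with presences (label 1).
labelled_background = [(x, y, 0) for x, y in background]
```

In practice, points falling outside the study region or on presence locations would usually be discarded and redrawn; that step is omitted here.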
BRT is an iterative machine learning method in which the deviance
residuals from the preceding decision tree serve as the response data
for the next tree (a process called “boosting”); trees are added until
further iterations no longer reduce the residual deviance (De’ath,
2007; Franklin, 2010). Decision trees, the underlying algorithm of BRT
and also known as classification and regression trees, perform well
with both continuous and categorical variables and, unlike GLM, are
robust to a lack of independence among predictors (De’ath, 2007; Elith
& Leathwick, 2009; Elith & Leathwick, 2017; Albuquerque et al., 2018).
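The boosting loop described above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in the study: it assumes a single predictor, uses one-split trees ("stumps"), and uses the residuals y - p (the negative gradient of the binomial deviance) as the response for each new tree. The data, learning rate, and stopping tolerance are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def deviance(y, f):
    """Binomial deviance of 0/1 labels y given scores f on the logit scale."""
    total = 0.0
    for yi, fi in zip(y, f):
        p = min(max(sigmoid(fi), 1e-12), 1.0 - 1e-12)
        total += -2.0 * (yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
    return total

def fit_stump(x, r):
    """Least-squares one-split tree on residuals r.

    Returns (threshold, left_mean, right_mean).
    Assumes x holds at least two distinct values.
    """
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((ri - ml) ** 2 for ri in left)
               + sum((ri - mr) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    return best[1:]

def boost(x, y, learning_rate=0.1, max_trees=500, tol=1e-4):
    """Add stumps until the deviance stops decreasing by at least tol."""
    f = [0.0] * len(y)              # current scores on the logit scale
    history = [deviance(y, f)]      # deviance after each accepted tree
    trees = []
    for _ in range(max_trees):
        # Residuals from the current ensemble become the next response.
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, f)]
        t, ml, mr = fit_stump(x, residuals)
        new_f = [fi + learning_rate * (ml if xi <= t else mr)
                 for xi, fi in zip(x, f)]
        new_dev = deviance(y, new_f)
        if history[-1] - new_dev < tol:
            break                   # no meaningful improvement: stop
        f = new_f
        trees.append((t, ml, mr))
        history.append(new_dev)
    return trees, history

# Hypothetical data: one predictor, presence (1) / pseudoabsence (0).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
trees, history = boost(x, y)
```

Here `history` records the deviance after each accepted tree, so it decreases monotonically until the stopping rule or the tree limit is reached.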
GLM is a well-known regression method that estimates, by maximum
likelihood, the contribution of each predictor variable to the
predicted “state” of a dependent variable, in this case the binary
outcome of presence/absence (Nelder & Wedderburn, 1972; Guisan,
Edwards, & Hastie, 2002).
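To make the maximum likelihood step concrete, the sketch below fits a binomial GLM with a logit link and a single predictor by Newton-Raphson iteration. It is an illustrative toy, assuming one predictor and hypothetical data; it is not the fitting routine used in the study, and the function name is invented for this example.

```python
import math

def fit_binomial_glm(x, y, n_iter=25):
    """Fit logit(p) = b0 + b1 * x by maximising the Bernoulli log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector: gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), a symmetric 2 x 2 matrix.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, using the 2 x 2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: one predictor, presence (1) / pseudoabsence (0).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_binomial_glm(x, y)

def predict(xi):
    """Predicted probability of presence at predictor value xi."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
```

A positive `b1` means the predicted probability of presence rises with the predictor. At the maximum likelihood estimate the score vector is zero, which is what the iteration converges toward.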
Maximum entropy (Maxent) is a machine learning method that employs
multinomial logistic regression to estimate the probability
distribution of a species according to the principle of “maximum
entropy”, i.e., the most uniform distribution possible given the
constraints imposed by the predictor variables (Phillips, Anderson, &
Schapire, 2006; Elith et al., 2011; Phillips, Anderson, Dudík,
Schapire, & Blair, 2017).
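The idea of "the most uniform distribution subject to constraints" can be illustrated with a toy example. The sketch below assumes a hypothetical landscape of 11 cells with a single predictor and finds the distribution over cells that has maximum entropy while matching the mean predictor value observed at presence cells. The solution has the exponential (Gibbs) form p_i ∝ exp(λ f_i). This is a one-feature sketch without regularization, not the Maxent software itself; all names and values are invented for illustration.

```python
import math

def gibbs(features, lam):
    """Distribution over cells with p_i proportional to exp(lam * f_i)."""
    weights = [math.exp(lam * f) for f in features]
    z = sum(weights)
    return [w / z for w in weights]

def expected(features, p):
    return sum(pi * f for pi, f in zip(p, features))

def fit_maxent(features, target_mean, n_iter=50):
    """Find lam so the expected feature under p equals target_mean."""
    lam = 0.0  # lam = 0 gives the uniform (unconstrained) distribution
    for _ in range(n_iter):
        p = gibbs(features, lam)
        mean = expected(features, p)
        var = sum(pi * (f - mean) ** 2 for pi, f in zip(p, features))
        # Newton step: the derivative of the mean with respect to lam
        # is the variance of the feature under p.
        lam += (target_mean - mean) / var
    return lam

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Hypothetical landscape: one predictor value per cell, from 0.0 to 1.0.
features = [i / 10.0 for i in range(11)]
# Hypothetical presence records, given as cell indices.
presence_cells = [6, 7, 7, 8, 7]
target = sum(features[i] for i in presence_cells) / len(presence_cells)

lam = fit_maxent(features, target)
p_fit = gibbs(features, lam)
p_uniform = gibbs(features, 0.0)
```

With no constraint (λ = 0) every cell is equally likely. Imposing the presence-derived constraint shifts probability toward cells with higher predictor values and lowers the entropy below that of the uniform distribution.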