2.3 Spatial overlap analysis using habitat suitability modeling
To predict suitable cottontail habitat across Connecticut, we ran
habitat suitability models in Maxent 3.4.4 (Phillips et al. 2020) for
New England and eastern cottontail separately. Maxent is a
machine-learning technique that estimates the unknown probability
distribution of a species across both geographic and environmental space
using predictors and presence locations, contrasted with the overall
distribution within a user-defined landscape (Phillips et al. 2006). We
used cottontail presence data and 15 environmental predictors (14
continuous and 1 categorical) to predict suitable habitat. We
used the state of Connecticut as the background extent for the Maxent
models due to the availability of fine-scale spatial data that
differentiated vegetation types and spatial patterns (Rittenhouse et al.
2022, Yang et al. 2023), recognizing that Connecticut covers only a portion
of each species’ range. Previous studies using habitat suitability
models have also limited the spatial extent to a portion of a species’
range when modeling rare species or when specific areas are of interest
(Warren et al. 2008, Lioy et al. 2023, Santamarina et al. 2023).
We used the Maxent default settings of randomly selecting 10,000
background locations from the entire study extent and a regularization
multiplier of 1.0 to balance model complexity and fit (Elith et al.
2011). We used the auto features function to evaluate linear, quadratic,
product, hinge, and categorical features based on the number of presence
records for each species (Phillips et al. 2017). We chose the logistic
output within Maxent to report values in terms of relative habitat
suitability (Elith et al. 2011). The logistic output is a logistic
transformation of the raw maximum entropy values that depends on a
prevalence value (τ), defined as the probability that a species is
present at sites with average conditions (Guillera-Arroita et al. 2014).
The default τ value in Maxent, 0.50, is only suitable for species with
similar prevalence values; otherwise, model error increases. Changing
the τ value to be more suitable for the species being studied is
therefore recommended (Guillera-Arroita
et al. 2014). Occupancy models at the landscape scale found New England
cottontail occupancy probability was lower and eastern cottontail
occupancy probability was higher than 0.50; thus, we ran three sets of
models for each species so τ would reflect the mean (τ = 0.25 for New
England cottontail, τ = 0.79 for eastern cottontail), lower 95%
credible interval (τ = 0.16 for New England cottontail, τ = 0.69 for
eastern cottontail) and upper 95% credible interval (τ = 0.36 for New
England cottontail, τ = 0.88 for eastern cottontail) values found in
Bischoff et al. (2023a). Specific values reported in the results
were from the mean τ value model runs, but the output maps were an
average of the outputs from all three τ values for each species.
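The role of τ in the logistic output can be illustrated with a short sketch. It assumes the form of the transformation described by Elith et al. (2011), in which the output depends on the model's linear score and on the relative entropy of the Maxent distribution with respect to the background. The function name and the arguments eta and r are hypothetical labels chosen for this illustration and are not Maxent identifiers.

```python
import math


def maxent_logistic(eta, r, tau=0.5):
    """Logistic Maxent output (assumed form, after Elith et al. 2011).

    eta : linear score of the fitted model at a site (hypothetical label)
    r   : relative entropy of the Maxent distribution to the background
    tau : prevalence, i.e. probability of presence at average sites
    """
    odds = math.exp(eta - r)
    return tau * odds / (1.0 - tau + tau * odds)


# Mean tau values reported in the text for each species.
mean_tau = {"New England cottontail": 0.25, "eastern cottontail": 0.79}

for species, tau in mean_tau.items():
    # A site with average conditions is represented here by eta == r.
    print(species, round(maxent_logistic(0.0, 0.0, tau), 2))
```

Under this form, a site with average conditions (eta equal to r) receives a logistic value equal to τ, so lowering τ from the 0.50 default shifts the whole suitability surface downward and raising it shifts the surface upward, without changing the ranking of sites.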
We ran a 10-fold cross-validation of the model for each τ value and
species (6 models total: 3 for New England cottontail and 3 for
eastern cottontail) to quantify error around model estimates and evaluate
model performance. We used cross-validation rather than a single split
into training and test data so that all the data could be used for model
validation (Phillips
et al. 2017) and to incorporate randomness into the testing and training
data that matches the randomness of the background data (Elith et al.
2011). We used area under the receiver operating characteristic curve
(AUC) to evaluate the performance of the Maxent models.
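As an illustration of the evaluation step, the sketch below partitions presence records into 10 folds and computes AUC as the probability that a randomly chosen presence site scores higher than a randomly chosen background site. It is a minimal stand-in for what Maxent reports internally, not the software's own code. The function names, the random seed, and the example scores are hypothetical.

```python
import random


def k_folds(n_records, k=10, seed=0):
    """Randomly partition record indices 0..n_records-1 into k folds."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    # Taking every k-th index keeps fold sizes within one record of each other.
    return [idx[i::k] for i in range(k)]


def auc(presence_scores, background_scores):
    """Rank-based AUC: share of presence/background pairs in which the
    presence site has the higher score (ties count as one half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Each fold is withheld once as test data while the other nine train the model.
folds = k_folds(n_records=25, k=10)
print([len(f) for f in folds])

# Hypothetical suitability scores for withheld presences and background sites.
print(auc([0.9, 0.7, 0.6], [0.2, 0.6, 0.1, 0.3]))
```

In a full run, the AUC would be computed once per withheld fold and then summarized across the 10 folds for each species and τ value.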
To find areas of high New England cottontail habitat suitability without
high eastern cottontail habitat suitability, we averaged the three model
outputs for each species and extracted the top 25% of suitability
values for each species to map high habitat suitability. We overlaid
these high habitat suitability outputs for both species and identified
areas that had high habitat suitability for New England cottontail
without high eastern cottontail habitat suitability.
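The averaging and overlay steps can be sketched on a toy grid stored as flat lists of cell values. The sketch assumes that "top 25%" means cells at or above the 75th percentile of each species' averaged surface; the function names and the example values are hypothetical and do not come from the study.

```python
def mean_surface(surfaces):
    """Cell-wise mean of equally sized suitability surfaces (one per tau run)."""
    return [sum(vals) / len(vals) for vals in zip(*surfaces)]


def top_quartile_mask(values):
    """True for cells at or above the 75th percentile.

    Assumption: 'top 25%' is read as the upper quartile of cell values,
    using a simple nearest-rank cutoff.
    """
    ranked = sorted(values)
    cutoff = ranked[int(0.75 * len(ranked))]
    return [v >= cutoff for v in values]


def high_a_without_high_b(surface_a, surface_b):
    """Cells with high suitability for species A but not for species B."""
    high_a = top_quartile_mask(surface_a)
    high_b = top_quartile_mask(surface_b)
    return [a and not b for a, b in zip(high_a, high_b)]


# Hypothetical averaged surfaces for eight grid cells.
nec = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # New England cottontail
ec = [0.8, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]   # eastern cottontail

target = high_a_without_high_b(nec, ec)
print([i for i, keep in enumerate(target) if keep])
```

With real rasters, the same logic would be applied to the averaged logistic outputs of the three τ runs for each species, with mean_surface producing each species' averaged surface before thresholding.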