2.3 Spatial overlap analysis using habitat suitability modeling
To predict suitable cottontail habitat across Connecticut, we ran separate habitat suitability models for New England cottontail and eastern cottontail in Maxent 3.4.4 (Phillips et al. 2020). Maxent is a machine-learning technique that estimates the unknown probability distribution of a species across geographic and environmental space by contrasting predictor values at presence locations with the overall distribution of those predictors within a user-defined landscape (Phillips et al. 2006). We used cottontail presence data and 15 environmental predictors (14 continuous and 1 categorical) to predict suitable habitat. We used the state of Connecticut as the background extent for the Maxent models because fine-scale spatial data that differentiated vegetation types and spatial patterns were available (Rittenhouse et al. 2022, Yang et al. 2023), recognizing that Connecticut covers only a portion of each species’ range. Previous studies using habitat suitability models have also limited the spatial extent to a portion of a species’ range when modeling rare species or when specific areas are of interest (Warren et al. 2008, Lioy et al. 2023, Santamarina et al. 2023).
We used the Maxent default settings of 10,000 background locations randomly selected from the entire study extent and a regularization parameter of 1.0 to balance model complexity and fit (Elith et al. 2011). We used the auto features function, which determines which of the linear, quadratic, product, hinge, and categorical features to use based on the number of presence records for each species (Phillips et al. 2017). We chose the logistic output within Maxent to report values in terms of relative habitat suitability (Elith et al. 2011). The logistic output is a logistic transformation of the raw maximum entropy values that depends on a prevalence value (τ), defined as the probability that a species is present at sites with average conditions (Guillera-Arroita et al. 2014). The Maxent default of τ = 0.50 is suitable only for species whose prevalence is close to that value; otherwise model error increases, and setting τ to a value more appropriate for the species being studied is recommended (Guillera-Arroita et al. 2014). Occupancy models at the landscape scale found that New England cottontail occupancy probability was lower than 0.50 and eastern cottontail occupancy probability was higher than 0.50 (Bischoff et al. 2023a). We therefore ran three sets of models for each species, setting τ to the mean (τ = 0.25 for New England cottontail, τ = 0.79 for eastern cottontail), the lower bound of the 95% credible interval (τ = 0.16 for New England cottontail, τ = 0.69 for eastern cottontail), and the upper bound of the 95% credible interval (τ = 0.36 for New England cottontail, τ = 0.88 for eastern cottontail) reported in Bischoff et al. (2023a). Specific values reported in the results are from the model runs that used the mean τ value, whereas the output maps are an average across all three τ values for each species.
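To make the role of τ concrete, the following minimal sketch assumes the logistic form described by Elith et al. (2011), in which the output equals τ·exp(η − r) / (1 − τ + τ·exp(η − r)). The function and argument names (`maxent_logistic`, `eta_minus_r`, `mean_across_tau`) are hypothetical and are not part of Maxent; the τ values are those listed above for New England cottontail.

```python
import math

def maxent_logistic(eta_minus_r: float, tau: float = 0.5) -> float:
    """Logistic output for a site, assuming the form in Elith et al. (2011).

    eta_minus_r: linear predictor minus the relative-entropy constant
                 (0.0 corresponds to a site with average conditions).
    tau:         assumed prevalence at sites with average conditions.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    odds = tau * math.exp(eta_minus_r)
    return odds / (1.0 - tau + odds)

# tau values for New England cottontail: lower bound, mean, upper bound
NEC_TAUS = (0.16, 0.25, 0.36)

def mean_across_tau(eta_minus_r: float, taus) -> float:
    """Average the logistic output over several tau settings (illustrative)."""
    return sum(maxent_logistic(eta_minus_r, t) for t in taus) / len(taus)
```

Under this form, a site with average conditions (`eta_minus_r = 0.0`) receives an output equal to τ itself, which is why a τ far from the true prevalence shifts every suitability value.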
We ran 10-fold cross-validation for each combination of τ value and species (6 models total, 3 for New England cottontail and 3 for eastern cottontail) to produce error around model estimates and evaluate model performance. We used cross-validation rather than a single split into training and test data so that all the data could be used for model validation (Phillips et al. 2017) and so that the randomness in the testing and training data matched the randomness of the background data (Elith et al. 2011). We used the area under the receiver operating characteristic curve (AUC) to evaluate the performance of the Maxent models.
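The validation logic can be illustrated outside Maxent with a short sketch. It is not the Maxent implementation: `k_fold_indices` and `auc` are hypothetical helper names, and the AUC is computed here as the probability that a randomly chosen presence location scores higher than a randomly chosen background location, with ties counted as one half.

```python
import random

def k_fold_indices(n_records: int, k: int = 10, seed: int = 0):
    """Split record indices 0..n_records-1 into k shuffled, near-equal folds."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    # Every k-th shuffled index goes to the same fold
    return [idx[i::k] for i in range(k)]

def auc(presence_scores, background_scores) -> float:
    """Rank-based AUC: share of presence/background pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(presence_scores) * len(background_scores))
```

In use, each fold would be held out in turn as test data while the remaining folds train the model, and the test AUC would be averaged across the 10 folds.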
To identify areas of high New England cottontail habitat suitability without high eastern cottontail habitat suitability, we averaged the three model outputs for each species and extracted the top 25% of suitability values for each species to map high habitat suitability. We then overlaid the high habitat suitability outputs for the two species and identified areas that had high habitat suitability for New England cottontail but not for eastern cottontail.
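The overlay step can be sketched as follows, treating each suitability map as a flat list of cell values. This is an illustration under stated assumptions, not the workflow used to produce the maps: the helper names are hypothetical, and "top 25%" is interpreted here as cells at or above the 75th percentile of each species' averaged map, which is only one possible reading.

```python
def cell_mean(*grids):
    """Cell-wise mean of equally sized suitability grids (flat lists)."""
    return [sum(vals) / len(vals) for vals in zip(*grids)]

def top_quartile_mask(values):
    """True where a cell is at or above the 75th percentile (assumed cutoff)."""
    ranked = sorted(values)
    cutoff = ranked[int(0.75 * len(ranked))]
    return [v >= cutoff for v in values]

def nec_without_ec(nec_grids, ec_grids):
    """Cells with high New England cottontail but not high eastern cottontail suitability."""
    nec_high = top_quartile_mask(cell_mean(*nec_grids))
    ec_high = top_quartile_mask(cell_mean(*ec_grids))
    return [n and not e for n, e in zip(nec_high, ec_high)]
```

Here `nec_grids` and `ec_grids` would each hold the three model outputs (one per τ value) for a species; the returned mask marks the cells retained after the overlay.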