Predicted distribution of suitable habitat
We used presence-only data analyzed under a maximum entropy approach to develop present-day ecological niche models (ENMs). Our goal was to evaluate the predicted distribution of Gonipterus platensis in its introduced range throughout South America, with a focus on Ecuador. We used climate data from 19 WorldClim variables summarizing temperature and precipitation features (Fick & Hijmans, 2017), together with elevation. Environmental data were trimmed to the regional extent of South America using the R package raster (Hijmans, 2023). The choice of environmental background can influence the predictive ability of ENMs (Elith et al., 2010). Therefore, we created a background extent to calibrate the ENM by generating a buffer of 500 km around each observed locality of G. platensis and sampling 10,000 random points within that extent. Final models were then projected onto the regional extent of South America.
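The background sampling step can be illustrated with a minimal sketch, written here in plain Python rather than with the R package raster used in the study. It assumes that buffer membership is tested by great-circle distance and that points are drawn by rejection sampling from a latitude/longitude bounding box; the function names and the two example coordinates are hypothetical and are not the study's records.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def sample_background(localities, n_points, buffer_km=500.0, seed=0):
    """Draw n_points random points lying within buffer_km of any locality.

    Assumption: points are uniform in latitude/longitude, which is not
    strictly equal-area; a raster workflow would sample grid cells instead.
    """
    rng = random.Random(seed)
    # Pad the bounding box of the localities by roughly the buffer distance.
    pad_lat = math.degrees(buffer_km / EARTH_RADIUS_KM)
    lats = [lat for lat, _ in localities]
    lons = [lon for _, lon in localities]
    lat_min, lat_max = min(lats) - pad_lat, max(lats) + pad_lat
    max_abs_lat = min(max(abs(lat_min), abs(lat_max)), 89.0)
    pad_lon = pad_lat / math.cos(math.radians(max_abs_lat))  # approximate
    lon_min, lon_max = min(lons) - pad_lon, max(lons) + pad_lon

    points = []
    while len(points) < n_points:
        lat = rng.uniform(lat_min, lat_max)
        lon = rng.uniform(lon_min, lon_max)
        # Keep the candidate only if it falls inside at least one buffer.
        if any(haversine_km(lat, lon, la, lo) <= buffer_km for la, lo in localities):
            points.append((lat, lon))
    return points

# Hypothetical localities (lat, lon), for illustration only.
example_localities = [(-0.18, -78.47), (-2.90, -79.00)]
background = sample_background(example_localities, n_points=10_000)
print(len(background))
```

Overlapping buffers are handled implicitly, because a candidate point is accepted once and only once regardless of how many buffers contain it.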
ENMs were generated using Maxent v3.4.1; this method is widely used and shows high predictive performance compared to other modeling methods (Elith et al., 2006; Phillips et al., 2006). Species localities were randomly partitioned into 75% training and 25% testing datasets, and model calibration followed a cross-validation approach with k = 5. We evaluated regularization multiplier values from 1 to 5 and six combinations of up to four feature classes (i.e., L, Q, H, LQ, LQH, and LQHP, where L = linear, Q = quadratic, H = hinge, and P = product) in the R package ENMeval 2.0 (Kass et al., 2021). The best tuning parameters for modeling were then selected using the Akaike Information Criterion (AIC; Appendix, Table A1). Maxent uses regularization to reduce model complexity, and the included variables contribute differentially to the final model (Phillips & Dudík, 2008). Thus, we included all 19 WorldClim variables and elevation in the model and allowed the algorithm to converge on the variables with the greatest contribution. The final model was calibrated using the background extent and the best tuning parameters (i.e., fc = LQH and rm = 2) and was projected onto South America and Ecuador. This approach allowed us to evaluate the predicted distribution of G. platensis across the introduced range.
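The partitioning and tuning design described above can be sketched as follows. This is an illustration in plain Python, not the ENMeval workflow itself. It assumes that the five folds are formed within the 75% training portion and that regularization multipliers were stepped in whole units, neither of which is stated explicitly in the text; all function and variable names are hypothetical.

```python
import random
from itertools import product

def train_test_split(records, test_fraction=0.25, seed=0):
    """Randomly partition records into (training, testing) lists."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def k_folds(records, k=5):
    """Yield (calibration, validation) pairs for k-fold cross-validation."""
    folds = [records[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        calibration = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield calibration, validation

# Candidate settings: six feature-class combinations from the text and
# five regularization multipliers (assumed integer steps from 1 to 5).
FEATURE_CLASSES = ["L", "Q", "H", "LQ", "LQH", "LQHP"]
REG_MULTIPLIERS = [1, 2, 3, 4, 5]
candidates = list(product(FEATURE_CLASSES, REG_MULTIPLIERS))

# Placeholder record identifiers stand in for real localities.
records = list(range(100))
training, testing = train_test_split(records)
for calibration, validation in k_folds(training):
    print(len(calibration), len(validation))
print(len(candidates), "candidate models")
```

Under these assumptions the grid contains 30 candidate models, each of which would be fitted and scored before the setting with the lowest AIC is retained.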
Model performance was assessed using the area under the receiver operating characteristic curve (AUC). AUC is a threshold-independent measure that varies from 0 to 1, where a score of 1 represents perfect discrimination and a score of 0.5 represents a model no better than random (Peterson et al., 2011). We considered an AUC score greater than 0.7 to represent good model predictions (Peterson et al., 2011). Given that AUC has been deemed unreliable for estimating the performance of presence-background models (e.g., Lobo et al., 2008), we separately calculated the Boyce Index (BI) in the R package ecospat to assess model prediction (Di Cola et al., 2017; Hirzel et al., 2006). The BI uses a Spearman rank correlation coefficient, which varies from -1 to 1 (Hirzel et al., 2006). A positive BI value approaching 1 indicates that model predictions are consistent with the evaluation dataset, a value of zero indicates random performance, and negative values indicate a poor match with the evaluation dataset (Hirzel et al., 2006).
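Both evaluation metrics can be illustrated with a short sketch in plain Python. The AUC is computed as the probability that a presence point receives a higher score than a background point, with ties counted as one half. The Boyce Index is a simplified moving-window version, not the ecospat implementation: the window width (one tenth of the score range) and the number of windows (100) are assumptions made for illustration, and all function names are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Probability that a presence outranks a background point (ties = 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

def _ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation (assumes both inputs vary)."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

def boyce_index(presence_scores, background_scores, n_windows=100, width=0.1):
    """Simplified continuous Boyce Index: rank correlation between window
    position and the predicted-to-expected (P/E) frequency ratio."""
    lo = min(min(presence_scores), min(background_scores))
    hi = max(max(presence_scores), max(background_scores))
    span = hi - lo
    w = width * span
    mids, ratios = [], []
    for s in range(n_windows):
        start = lo + (span - w) * s / (n_windows - 1)
        end = start + w
        predicted = sum(start <= v <= end for v in presence_scores) / len(presence_scores)
        expected = sum(start <= v <= end for v in background_scores) / len(background_scores)
        if expected > 0:  # skip windows with no background points
            mids.append(start + w / 2)
            ratios.append(predicted / expected)
    return spearman(mids, ratios)

print(auc([0.8, 0.4], [0.6, 0.2]))  # 0.75: three of four pairs ranked correctly
```

A model whose presences concentrate in high-suitability windows yields P/E ratios that rise with suitability, and therefore a BI approaching 1.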
Because G. platensis is invasive in South America, the final projected model implemented a lowest presence threshold of 95% (LPT95, equivalent to the Minimum Training Presence threshold) obtained from the model estimated by the Maxent cloglog output (Soto-Centeno & Steadman, 2015). Under this rule, prediction pixels with values equal to or higher than the LPT95 were scored as suitable conditions where G. platensis could sustain viable populations in the introduced range. We chose LPT95 to provide a conservative prediction in which at least 95% of locality points fell within suitable habitat (i.e., a theoretical expectation of a 5% omission rate of the training data; Pearson et al., 2007). This threshold also helped us determine visually whether our ENMs allowed enough sensitivity to examine novel areas of environmental suitability where G. platensis could establish populations in South America.
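The thresholding rule can be sketched as below, again in plain Python. The sketch reads LPT95 as the highest suitability value at or above which at least 95% of training presences fall, an interpretation based on the 5% omission rate stated above. The function names and the example values are hypothetical and are not model output from this study.

```python
import math

def lpt_threshold(training_suitability, retain=0.95):
    """Highest threshold that keeps at least `retain` of the training
    presences at or above it."""
    ordered = sorted(training_suitability, reverse=True)
    # round() guards against floating-point error before taking the ceiling.
    n_keep = math.ceil(round(retain * len(ordered), 9))
    return ordered[n_keep - 1]

def binarize(grid, threshold):
    """Score each pixel 1 (suitable) if its value is >= threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in grid]

def omission_rate(presence_suitability, threshold):
    """Fraction of presences that fall below the threshold."""
    below = sum(1 for value in presence_suitability if value < threshold)
    return below / len(presence_suitability)

# Illustrative cloglog-like values at 20 training localities.
training_values = [i / 20 for i in range(1, 21)]
threshold = lpt_threshold(training_values)
print(threshold)                                  # 0.1
print(omission_rate(training_values, threshold))  # 0.05
print(binarize([[0.05, 0.10], [0.40, 0.90]], threshold))
```

With 20 training values, 19 are retained and one is omitted, which matches the expected 5% training omission rate.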