Main Body:
Larsen & Shirey 2021 (ELE13731) identified significant concerns with the phenological analysis in Fric et al. 2020 (ELE13419). In response, Fric et al. 2021 (ELE13739) presented three arguments: (1) ELE13731 switched onset and termination results in the reanalysis of ELE13419 data and when corrected, the reanalysis supports the original conclusions, (2) removing data with known errors doesn’t change overall results, and (3) the benefits of including more species for life history trait analysis justifies analysing species with sparse data. These arguments fail to confront the spatiotemporal bias in the underlying data (Figure 1). ELE13739 reported that most species demonstrate later termination at higher latitudes and do not have detectable shifts in onset across latitudes. These spurious patterns resulted from coarse data aggregation and spatiotemporal variation in recorder activity (Steel et al. 2013, Isaac et al. 2014, Knape et al. 2021)
The first argument in ELE13739 is not supported by the evidence. Species-specific visualizations in ELE13731 demonstrate non-significant and positive correlations between latitude and species-specific onset, with more varied latitudinal patterns in termination (ELE13731 Figure S1). The ELE13739 reanalysis addressed data curation but ignored other ELE13731 methods; ELE13731 analytical code has been validated and results independently reproduced (Campbell and Belitz 2021).
The second argument dismissed the importance of data curation, claiming “stringent reanalysis” did not change results significantly. However, comparing glm and lmer results in ELE13739, onset and termination responses differ for 11 and 23 species/regions (of 88) respectively. Here, we quantify changes in inferred phenological trends attributable to data curation, data aggregation, phenometric estimation, and effort corrections (Supplement 1). Applying mixed effect models with random intercepts for year and species/region (“population”), we find that inferred latitudinal patterns are sensitive to methodology, particularly data aggregation, for both onset (35 populations) and termination (16 populations). Overall, we identify later onset at higher latitudes for 35 of 60 univoltine populations with annual phenometrics in at least two latitudes, and 17 of 18 populations with 10+ latitudes (Figure 2). We find no evidence of earlier onset at higher latitude for any population, refuting ELE13739 results. Latitudinal trends are rarely detected for termination metrics, which are earlier or later at higher latitude for 5 and 11 populations, respectively. Overall, different minimum data thresholds and aggregation decisions drive the significant differences between ELE13739 and ELE13731 results.
The third argument in ELE13739 presents a challenge for the research community as unstructured occurrence data become widely used. There is a trade-off between maximizing the research scope, and ensuring the quality of data input into models. Researchers must consider how variable sampling informs scope, methods, and interpretation of results, and determine when data are sufficient for analysis. These decisions are difficult; disagreement among research groups about best practices is expected. However, our analysis demonstrates that improper data aggregation and small sample size can lead to unreliable conclusions. Methodological differences are discussed further in Supplement 2.
Many factors contribute to spatiotemporal bias in occurrence data. Separating sampling factors (eg, observer density) from ecological factors (eg, longer phenological duration) is challenging; data handling and analytical decisions determine research outcomes. ELE13739 estimated phenometrics from as few as 2 occurrences, and used one pair of phenometrics for each latitudinal band in each population, removing stratification across years and elevations. Should ELE13739 results be taken at face value, the plurality of studied species experience longer flight periods at higher latitudes. This result contradicts the long-established understanding of lepidopteran thermal physiology, and does not hold up to scrutiny. ELE13731 demonstrated different phenological patterns, having aggregated within years and required 10 occurrences; this included fewer species and latitudes, but more phenometrics per latitudinal band (Supplement 1). While previous analyses of these data did not account for observer effort, the effort metrics we include rarely affect inferred phenological responses to latitude. Misleading results in ELE13739 resulted primarily from data limitations and overly coarse data aggregation.
Recent accessibility of aggregated occurrence data requires careful consideration of how to avoid spurious patterns which may arise from data biases. Spatiotemporal variation in recorder activity in aggregated occurrence data may lead to misleading results (Isaac et al. 2014, Amano et al. 2016, Larsen and Shirey 2021). To disentangle observation patterns from ecological patterns and avoid spurious results, analyses must consider underlying observation processes (Tweedie et al. 1994). Fortunately, there is a growing body of work on how to identify and mitigate issues related to observer bias in aggregated data (Isaac et al. 2014, Shirey et al. 2021, Van Eupen et al. 2021). Similarly, several approaches have been developed for estimating robust phenometrics from occurrence data (eg, Kharouba et al. 2014, Chapman et al. 2015, Inouye et al. 2019, Belitz et al. 2020, Cima et al. 2020). With the continuing development of robust methodologies, occurrence data have great potential to expand the taxonomic and spatial domains of ecological knowledge.