DISCUSSION
Avian surveys using ARUs can overcome major limitations of point count methods. In our study system, these include limitations associated with remote, difficult terrain and late snowmelt, as well as the disruption of surveys by inclement weather. Such advantages potentially make ARUs a powerful substitute for point counts (e.g. Darras et al., 2019). Our results, however, indicate that ARUs should be augmented by point counts: dual methods allowed us to identify detection differences between methods where they were not anticipated. In this specific case, performance differences are likely attributable to differences in community composition between regions (as we discuss below). More generally, however, our results show how dual methods enable monitoring programs to flag detection issues associated with survey method and thus enhance comparisons across habitat types and ecosystems.
High mountain habitats in BC and Chile are structurally similar, yet ARU performance was markedly better in BC than in Chile. This illustrates that avian community composition can matter as much as habitat composition in influencing method performance. As in Klingbeil and Willig (2015), we believe differences in detection probability that favour point counts in Chile are largely due to visual identification of species rather than audio detection. Raptor diversity is higher in Chile than in BC, and this largely silent group is best monitored by point counts: ARUs missed six raptor species that were picked up by point counts (Table S1). Similarly, tyrant flycatchers (Tyrannidae) rarely vocalize: the Xeno-canto Foundation notes that, of all neotropical genera, ground-tyrants and shrike-tyrants are the most difficult to record. Five of the nine tyrant species recorded in this study were missed by ARUs. Changes in vocalization frequency may also drive the seasonal variation in ARU detectability observed for 5 of 11 families in Chile. Song activity likely wanes when females are incubating or when pairs are feeding young (Moussus et al., 2009), yet these individuals may remain visible during point counts when foraging. Interestingly, seasonal variation in detection probability was not supported for any family in BC.
ARUs allow audio to be replayed in order to capture all calls and confirm species identity. In contrast, point counts are more vulnerable to observer effects: observers at point counts may miss species because they subconsciously screen out certain calls (“window species”; Kepler & Scott, 1981), are overwhelmed by the number of calling species (Celis-Murillo et al., 2009; Hutto & Stutzman, 2009), or misidentify difficult calls (Bart, 1985; Celis-Murillo et al., 2009). This may explain why ARUs perform well in the species-rich upper montane (Fig. 1), and why a single ARU count/site in BC detected more species than a single point count/site, despite observation effort being equivalent (6 min/site; Fig. 2A and Fig. 3A). Two alternative explanations were not well supported by our data: that ARUs capture species’ peak activity because they sample a broader period of the morning, and that ARUs fail to screen out songs originating outside of their focal habitat and therefore overstate species diversity. First, richness by hour showed no evidence of a peak in BC (Fig. 1A), nor was there an ARU detection peak over the morning within families (Fig. S2). Warblers, thrushes and kinglets were, however, all less likely to be detected by point counts later in the morning, pointing toward observer bias in point counts (Fig. S2). Second, because vocalizations tend to carry upslope, we would expect ARUs near habitat transition zones to misassign species to higher-elevation habitats. Instead, ARUs in BC detected greater species diversity than point counts in upper montane habitat, not in the subalpine or alpine (Fig. 1A).
The ability to collect large amounts of data is one of the advantages of ARUs and, because the collection process itself is cheap, there is a temptation to obtain as much data as possible. However, the added time cost per sample of processing ARU data, compared with point count surveys, needs to be carefully considered when planning monitoring protocols. Advances in automated processing may change this calculation (e.g. Knight et al., 2020), but additional time costs associated with training algorithms and proofing output still need to be considered (Joshi et al., 2017; Knight et al., 2017). Where ARUs perform poorly, as in the mountains of southern Chile, repeated sampling does not improve survey coverage (Fig. 2B). In other words, ARUs, like point counts, may miss large portions of communities regardless of effort. Programs should ascertain whether this is the case before investing in increased ARU sampling. In this study, increased effort involved increased sampling within days: it is possible that sampling more days, with lower effort within each day, would yield better returns. Detections of four nocturnal species in dawn ARU recordings highlight the benefit of synchronous sampling across survey sites.
Our work aligns with smaller studies that conclude dual methods are advantageous across a range of habitats (Celis-Murillo et al., 2009; 2012 (in specific cases); Tegeler et al., 2012; Alquezar & Machado, 2015; Vold et al., 2017), as well as two larger studies within temperate and boreal forest (Holmes et al., 2014; Van Wilgenburg et al., 2017). Our comparison across structurally similar habitats in different geographic regions highlights the importance of the avian community, in addition to habitat, in influencing method performance. We additionally show that the benefit-to-time-cost ratio of dual methods that employ 1-2 point counts/site is comparable to or better than that of single-method approaches. Because our study system has relatively low species richness, our time costs for ARU transcription were relatively low. Where ARU processing is more time consuming, the benefits of employing dual methods should be more pronounced.