Detection probability by method
For species that were detected by one method only, we assessed whether this reflected a difference in detection between methods rather than chance, using Fisher’s exact test on the frequency of detection by method across all site-surveys (Fisher, 1992).
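As an illustration of this test, the sketch below computes a two-sided Fisher's exact p-value for a 2x2 table of detections by method. It is a minimal pure-Python version written for clarity, not the code used in the analysis, and the counts in the example are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are methods (e.g. ARU, point count); columns are site-surveys
    with and without a detection of the species.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))

# Hypothetical counts: species detected in 8 of 100 ARU site-surveys
# and in 0 of 100 point-count site-surveys.
p_value = fisher_exact_two_sided(8, 92, 0, 100)
```

A small p-value here would suggest that detection by a single method is unlikely to be due to chance alone.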
Because we were also interested in generalizable patterns of detection, we pooled species into family groups and assessed each family’s detection probability, by method, using the R package ‘unmarked’ (Fiske & Chandler, 2011; Table S2(A) and S2(B)). For point counts, detections at all sites were used for modeling detection probability, including sites that did not have ARUs installed (BC: n=129 sites; Chile: n=150 sites). The number of repeated surveys at each site ranged from 3 (point-count-only sites) to 23 (sites with both ARU and point count data) (BC: n=1065 site-surveys; Chile: n=900 site-surveys). We restricted the families modeled to those that occupied at least 15% of sites in any of our three habitat types. Detection modeling was restricted to those habitats in which 90% of occupied sites occurred.
Because ARUs were repeatedly sampled within-day with spacing of ~ 1 hour (58 ±13 min), we expected temporal autocorrelation between surveys within-site and incorporated this into our models using a first-order Markov covariate (Wright et al., 2016).
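One simple way to build such a covariate is to lag the detection history by one survey, so that each survey carries an indicator of whether the family was detected on the preceding survey. The sketch below assumes this lagged-indicator form and assumes that the first survey in a sequence is coded 0; the exact formulation in the analysis may differ, and the example history is hypothetical.

```python
def lagged_detection(history):
    """First-order lag of a 0/1 detection history from consecutive surveys."""
    history = list(history)
    if not history:
        return []
    # Assumption: the first survey of a sequence has no predecessor, coded 0.
    return [0] + history[:-1]

# Hypothetical ARU detection history for one site on one day.
same_day = [0, 1, 1, 0, 1]
covariate = lagged_detection(same_day)
```

The resulting vector can then enter the detection model as an ordinary survey-level covariate.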
Our base detection probability model was:
detection ~ wind score + hours after sunrise + hours after sunrise² + date + date² + canopy cover + canopy cover² + temporal autocorrelation term
And site occupancy probability was modeled as:
occupancy ~ site elevation + residuals of canopy cover by elevation.
Canopy cover residuals were used in the occupancy model to account for collinearity between elevation and canopy cover (i.e. trees become sparser at higher elevations). In Chile, canopy cover values at the time of sampling were used for modeling detection to account for leafing-out, while maximum canopy cover at each site (reflective of habitat type) was used for modeling occupancy.
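The residual covariate can be sketched as follows, assuming a simple linear regression of canopy cover on elevation (the functional form is an assumption; the site values are hypothetical). The residuals capture the variation in canopy cover that elevation does not explain, so the two occupancy covariates are uncorrelated by construction.

```python
from statistics import mean

def canopy_residuals(elevation, canopy):
    """Residuals from an ordinary least-squares fit of canopy on elevation."""
    x_bar, y_bar = mean(elevation), mean(canopy)
    sxx = sum((x - x_bar) ** 2 for x in elevation)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(elevation, canopy))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(elevation, canopy)]

# Hypothetical sites: elevation (m) and canopy cover (%).
elev = [900, 1100, 1300, 1500, 1700]
cover = [80, 72, 55, 40, 15]
resid = canopy_residuals(elev, cover)
```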
To our base detection model, we added an effect of method (ARU vs. point count) on detection, plus interactions between method and three survey parameters whose effects on detection were predicted to differ between ARUs and point counts: canopy cover, hours after sunrise, and date. We tested the performance of the base model, the base + method model, and the seven possible models that included combinations of method × survey parameter interactions. In total, nine detection models were tested for each bird family.
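The count of nine candidate models follows from the combinatorics: one base model, one base + method model, and one model for each non-empty subset of the three interactions (2³ − 1 = 7). The sketch below enumerates them; the term labels are illustrative names, and it assumes that every interaction model also contains the main effect of method.

```python
from itertools import combinations

BASE = ["wind", "hours", "hours^2", "date", "date^2",
        "canopy", "canopy^2", "autocorr"]
INTERACTION_TARGETS = ["canopy", "hours", "date"]

def candidate_models():
    """Return the detection-model term lists for one bird family."""
    models = [list(BASE), BASE + ["method"]]
    # One model per non-empty subset of the three method interactions.
    for k in range(1, len(INTERACTION_TARGETS) + 1):
        for combo in combinations(INTERACTION_TARGETS, k):
            terms = BASE + ["method"] + [f"method:{t}" for t in combo]
            models.append(terms)
    return models

models = candidate_models()
```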
We selected the best model for each family based on QAIC, incorporating ĉ for the most complex model (detection ~ base model + method + all three method interactions) (Burnham & Anderson, 2002; MacKenzie et al., 2017; Mazerolle, 2017). Goodness-of-fit tests were run for these best models and, where ĉ > 1, we inflated the CIs accordingly. We do not present output for any family where ĉ > 4 (suggesting lack of fit; Mazerolle, 2017) or where ĉ < 0.3 (indicating insufficient data). We report both 84% and 95% CIs: non-overlapping 84% CIs are consistent with a significant difference (P<0.05) between methods (Payton et al., 2003), while the 95% CIs describe the uncertainty in each estimated detection probability. Further detail on detection probability modeling is available in the Supplement.
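For reference, the quantities involved can be sketched as below, using the standard form QAIC = −2 ln(L)/ĉ + 2K and inflating standard errors by √ĉ. This is an illustration under stated assumptions, not the analysis code: intervals are assumed to be built on the logit scale and back-transformed, ĉ values below 1 are treated as 1, and K counts only the model's own parameters (some implementations add one for ĉ, which shifts every model equally). The numeric inputs are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

def qaic(log_lik, n_params, c_hat):
    """Quasi-AIC for a model with log-likelihood log_lik and n_params parameters."""
    c = max(c_hat, 1.0)  # assumption: no adjustment when c-hat < 1
    return -2.0 * log_lik / c + 2.0 * n_params

def expit(x):
    return 1.0 / (1.0 + exp(-x))

def inflated_ci(logit_estimate, logit_se, c_hat, level):
    """Back-transformed CI for a detection probability, with SE scaled by sqrt(c-hat)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    se = logit_se * sqrt(max(c_hat, 1.0))
    return expit(logit_estimate - z * se), expit(logit_estimate + z * se)

# Hypothetical estimate for one family and method.
ci84 = inflated_ci(-0.4, 0.25, c_hat=1.8, level=0.84)
ci95 = inflated_ci(-0.4, 0.25, c_hat=1.8, level=0.95)
```

With ĉ = 1 the first function reduces to ordinary AIC and the interval is left uninflated.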
We assessed the efficiency of single-method and mixed-method sampling protocols as the percent of the total community detected as a function of hours of effort. For ARUs, site visitation and sample processing costs were assessed at 40 min/site and 9 min/sample, respectively. For point counts, these values were 20 min/site and 7 min/sample. When protocols were mixed, we assumed that the visitation cost was shared between ARUs and point counts, i.e. that point counts were conducted when ARUs were deployed and/or retrieved. In protocols that involved 3 point counts per site, the third point count incurred an additional visitation cost (20 min/site). We randomly sampled ARU and point count surveys with replacement (10,000 times) at each survey site to produce a bootstrapped mean species richness detected (±SE) across all sites for different sampling intensities: ARUs alone (1-15 counts/site), point counts alone (1-3 counts/site), and point counts plus ARU surveys (1 point count plus 1-15 ARU counts/site, 2 point counts plus 1-15 ARU counts/site, etc.). We identified the “best” protocols as those that detected the greatest percentage of the total community for the least effort.
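The effort accounting and resampling described above can be sketched roughly as follows. Several details are assumptions made for illustration: point-count-only protocols are charged one 20 min visit per count, the SE is taken as the standard deviation of the bootstrap means, every site is assumed to have at least one survey for each method sampled, and the data structure (a dictionary of per-site species sets) is hypothetical.

```python
import random
from statistics import mean, stdev

ARU_VISIT_MIN, ARU_SAMPLE_MIN = 40, 9
PC_VISIT_MIN, PC_SAMPLE_MIN = 20, 7

def effort_minutes(n_pc, n_aru):
    """Per-site effort (minutes) for n_pc point counts and n_aru ARU samples."""
    processing = n_pc * PC_SAMPLE_MIN + n_aru * ARU_SAMPLE_MIN
    if n_aru > 0:
        # Visits are shared with ARU deployment/retrieval; a third point
        # count needs one extra visit.
        visits = ARU_VISIT_MIN + (PC_VISIT_MIN if n_pc >= 3 else 0)
    else:
        # Assumption: one visit per point count when no ARU is deployed.
        visits = PC_VISIT_MIN * n_pc
    return visits + processing

def bootstrap_richness(sites, n_pc, n_aru, n_boot=1000, seed=1):
    """Bootstrapped mean species richness across sites, with its SE."""
    rng = random.Random(seed)
    boot_means = []
    for _ in range(n_boot):
        richness = []
        for surveys in sites.values():
            # Sample surveys with replacement within each site and method.
            drawn = (rng.choices(surveys["pc"], k=n_pc)
                     + rng.choices(surveys["aru"], k=n_aru))
            richness.append(len(set().union(*drawn)))
        boot_means.append(mean(richness))
    return mean(boot_means), stdev(boot_means)

# Hypothetical data: species sets detected on each survey at two sites.
sites = {
    "s1": {"pc": [{"a", "b"}, {"a"}], "aru": [{"b", "c"}, {"c"}, {"a", "d"}]},
    "s2": {"pc": [{"a"}, {"e"}], "aru": [{"a"}, {"a", "e"}, {"f"}]},
}
richness_mean, richness_se = bootstrap_richness(sites, n_pc=1, n_aru=2, n_boot=200)
```

Dividing the bootstrapped richness by the total community size and pairing it with `effort_minutes` for each protocol gives the percent-detected versus effort comparison described in the text.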