Benjamin Goldstein

and 1 more

Participatory science (or "citizen science") records are becoming increasingly useful for wildlife monitoring due to their volume and spatiotemporal coverage. However, statistical analysis of these data can be challenging because of the many sources of bias that must be corrected. Many previous studies characterize sampling biases across entire participatory science datasets, such as spatial heterogeneity in sampling effort or species preferences. User-level heterogeneity in sampling behavior is less well studied, but it may be just as important as dataset-level bias in contributing to error in downstream analyses. Here, we investigate user-level novelty and specialization bias. Novelty bias occurs when an individual observer preferentially reports species that they have not seen before, while specialization bias occurs when an observer preferentially reports species they have previously observed (i.e., they specialize in particular species). We provide the first test of this kind of user-level sampling bias in participatory science data by analyzing the sampling histories of more than 540 observers on the popular participatory science platform iNaturalist in Pennsylvania, USA. We find evidence of specialization or novelty bias in the overall sampling behavior of 66% of the observers considered. Specialization bias was more than 5 times more common than novelty bias, indicating that observers reported species they had reported previously at a higher rate than expected. Looking within taxonomic groups, 41% of observers deviated from unbiased sampling. Novelty bias and specialization bias were both common within taxa. These findings suggest that iNaturalist observers often specialize in favorite taxa or species, while within taxa some users simultaneously seek out previously unobserved species.