Screening of titles, abstracts, and main text
Titles and abstracts were screened for suitability. Suitable abstracts
mentioned at least one group of soil fauna measured in at least one
reference, undisturbed, or control site, and one site impacted by a GC.
To aid in screening of titles and abstracts, we used a machine-learning
algorithm in the program Abstrackr alongside human screening. Whilst the
abstracts and titles were being manually screened, Abstrackr dynamically
assigned a confidence score to every paper. After the manual
screening of 9,535 abstracts (of which 6,143 were irrelevant and 3,389
were included), the Abstrackr confidence score was 0.58 or under for the
remaining 15,444 articles, a low enough value to indicate the remaining
articles were not relevant for the meta-analysis. This cut-off value of
0.58 was chosen using a quality-control procedure in which we
randomly sampled 5% of the records within each 0.1 band of confidence
scores and screened their titles to classify them as ‘may be’ suitable
or ‘definitely not’ suitable. The cut-off confidence score was
then set at the point where ‘definitely not’ suitable papers made up
the majority of the titles within a 0.01 band. Thus, the
15,444 articles were not considered further.
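The quality-control sampling described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the record-id-to-score mapping, the fixed random seed, and the rule of drawing at least one record per band are all assumptions made for illustration. The sketch draws 5% of records within each 0.1 band of confidence scores and then reports, per band, the share of sampled titles judged ‘definitely not’ suitable.

```python
import math
import random
from collections import defaultdict


def band_of(score, width=0.1):
    """Return the index of the confidence band containing `score`."""
    n_bands = round(1 / width)
    # The small epsilon guards against floating-point error at band edges;
    # a score of exactly 1.0 is folded into the top band.
    return min(int(score / width + 1e-9), n_bands - 1)


def sample_for_quality_control(scores, fraction=0.05, width=0.1, seed=0):
    """Randomly draw `fraction` of the record ids within each band.

    `scores` maps a (hypothetical) record id to its confidence score.
    """
    rng = random.Random(seed)
    bands = defaultdict(list)
    for record_id, score in scores.items():
        bands[band_of(score, width)].append(record_id)

    sample = {}
    for band, ids in sorted(bands.items()):
        # Assumption: always draw at least one record from a non-empty band.
        k = max(1, math.ceil(fraction * len(ids)))
        sample[band] = rng.sample(ids, k)
    return sample


def definitely_not_share(labels_by_band):
    """Share of screened titles labelled 'definitely not', per band.

    `labels_by_band` maps a band index to a list of labels, each either
    'may be' or 'definitely not'.
    """
    return {
        band: sum(label == "definitely not" for label in labels) / len(labels)
        for band, labels in labels_by_band.items()
        if labels
    }
```

A cut-off could then be read from the band in which the ‘definitely not’ share first exceeds 0.5, with the same tally repeated at a finer band width (for example `width=0.01`) around that point.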
The full texts of the 3,389 papers with relevant abstracts and titles
were then manually screened. To be suitable for the analysis, an
article needed to have (1) measured at least one soil fauna group
(e.g., earthworms, macro-fauna, oribatid mites), (2) captured the impact
of one or several GCs according to our GC-specific inclusion criteria
(see supplementary materials), and (3) presented the necessary data
(mean values, variance, and sample sizes) to allow us to calculate an effect size for
the meta-analysis.
As no definition, catalogue, or list exists of organisms considered
‘soil biodiversity’, soil fauna was determined based on the sampling
protocol. Suitable sampling methods included soil cores, hand-sorting
excavated soil blocks, or mustard extraction. Pitfall traps on their own
were not considered suitable, as these data are more representative of
activity densities of ground-dwelling invertebrates. However, if the
pitfall traps were combined with another method targeting the soil,
the study was considered suitable.
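The sampling-method rule above amounts to a simple check, sketched below in Python. The method labels and the function name are hypothetical, and the set of soil-targeting methods contains only the examples named in the text, so it should not be read as an exhaustive list.

```python
# Assumption: illustrative labels for the soil-targeting methods named in
# the text; the actual list of accepted methods may be longer.
SOIL_TARGETING_METHODS = {"soil core", "hand-sorting", "mustard extraction"}


def sampling_is_suitable(methods):
    """Return True if at least one method targets the soil itself.

    Pitfall traps alone fail the check; pitfall traps combined with a
    soil-targeting method pass it.
    """
    normalised = {m.strip().lower() for m in methods}
    return bool(normalised & SOIL_TARGETING_METHODS)
```

For example, `sampling_is_suitable(["pitfall trap"])` returns `False`, whereas `sampling_is_suitable(["pitfall trap", "soil core"])` returns `True`.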