2.6 Statistical analyses
Alpha diversity metrics, including the Shannon index and Pielou’s evenness, were used to estimate the community diversity of abundant and rare taxa. Faith’s phylogenetic diversity (PD) was used to estimate their phylogenetic diversity. The Shannon index, Pielou’s evenness, and Faith’s PD were calculated using the “diversity” function in the R package “vegan” (Oksanen et al., 2015). Further, the standardized effect size of the mean nearest-taxon distance (SES.MNTD) was applied to evaluate the phylogenetic clustering of abundant and rare taxa. SES.MNTD was calculated relative to the mean of the null distribution using the “ses.mntd” function (null.model = “taxa.labels”, 999 randomizations) in the R package “picante” (Kembel et al., 2010). Statistical differences in the above diversity indices were determined using the Wilcoxon rank-sum test in SPSS 22.0 (IBM, Armonk, NY, USA).
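For readers unfamiliar with these indices, the following is a minimal Python sketch of the underlying formulas. It is not the analysis code used in this study, which relied on the R packages cited above. The function names and the example counts are hypothetical, and the sketch assumes natural logarithms and a sample standard deviation for the null distribution.

```python
import math
import statistics

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S), where S is the number of observed taxa."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return float("nan")  # evenness is undefined for fewer than two taxa
    return shannon_index(counts) / math.log(richness)

def standardized_effect_size(observed, null_values):
    """SES = (observed - mean of null values) / standard deviation of null values."""
    return (observed - statistics.mean(null_values)) / statistics.stdev(null_values)

# Hypothetical OTU counts for one sample (illustration only).
example_counts = [10, 10, 10, 10]
example_shannon = shannon_index(example_counts)
example_evenness = pielou_evenness(example_counts)
```

Under this definition, a negative SES.MNTD means that the observed MNTD is smaller than the null expectation, which is commonly read as phylogenetic clustering.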
Variations in the community composition and phylogenetic composition of abundant and rare taxa were quantified using Bray–Curtis dissimilarity matrices and the beta mean nearest-taxon distance (βMNTD), respectively. The Bray–Curtis dissimilarity was calculated using the “vegdist” function in the R package “vegan” (Oksanen et al., 2015), whereas βMNTD was calculated using the “comdistnt” function in the R package “picante” (Kembel et al., 2010).
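As an illustration of the Bray–Curtis measure, the sketch below computes the dissimilarity between two samples and a pairwise matrix for a set of samples. It is a plain-Python stand-in written for this illustration, not the “vegdist” implementation; the function names are hypothetical.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / (sum(a) + sum(b))."""
    numerator = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    denominator = sum(sample_a) + sum(sample_b)
    if denominator == 0:
        return float("nan")  # undefined when both samples are empty
    return numerator / denominator

def pairwise_bray_curtis(samples):
    """Symmetric matrix of Bray-Curtis dissimilarities between all samples."""
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = bray_curtis(samples[i], samples[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix
```

Community similarity, as used in the distance-decay analysis, is then 1 minus this dissimilarity.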
Pairwise geographic distance was determined from the latitude and longitude of each site using the “distGeo” function in the R package “geosphere” (Hijmans et al., 2016). Linear regression was used to assess the relationships between bacterial community similarity (1 - Bray–Curtis dissimilarity) or phylogenetic similarity (1 - βMNTD) and geographic distance. Variation partitioning analysis (VPA) was used to tease apart the pure effects of physicochemical factors, land use types, and space on the variation of abundant and rare taxa. Spatial variables in the VPA were derived by principal coordinates of neighbor matrices (PCNM) analysis using the “pcnm” function in the R package “vegan” (Borcard and Legendre, 2002; Oksanen et al., 2015).
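The distance-decay relationship can be sketched as follows. This is an illustrative approximation only: it uses the haversine great-circle formula on a sphere, whereas “distGeo” computes geodesic distances on an ellipsoid, so the two will differ slightly. The Earth radius constant, the function names, and the choice of kilometres are assumptions of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def ols_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Here `x` would hold the pairwise geographic distances and `y` the corresponding similarities; a negative slope indicates that similarity decays with distance.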
Threshold indicator taxa analysis (TITAN) was carried out to detect and interpret biodiversity and environmental thresholds of abundant and rare taxa using the R package “TITAN2” (Baker and King, 2010). Briefly, the sums of indicator taxa scores for bacterial OTUs were used to determine the lower and upper thresholds of change in abundant and rare taxa along each environmental variable (Jiao and Lu, 2020; Wan et al., 2021a; Wan et al., 2021b).
Phylogenetic signal is the tendency for related species to resemble each other more than they resemble species drawn at random from the phylogenetic tree (Blomberg and Garland Jr, 2002). Before testing for phylogenetic signals, we first obtained potential trait information for both abundant and rare taxa from Spearman’s correlations between the relative abundances of OTUs and environmental variables (Oliverio et al., 2017). OTUs that showed significant associations (positive or negative) with a given environmental variable were identified as species with a preference for that environmental variable. For example, OTUs positively or negatively correlated with pH were identified as “alkaline-preferred taxa” or “acid-preferred taxa”, respectively. Subsequently, phylogenetic signals for the environmental preferences of abundant and rare taxa were calculated with Blomberg’s K statistic using the “multiPhylosignal” function in the R package “picante” (Kembel et al., 2010). Blomberg’s K statistic tests whether the observed trait variation across a phylogeny is smaller than expected under a Brownian motion model of trait evolution (Blomberg et al., 2003). K values higher than 1 imply strong phylogenetic signal and trait conservatism, whereas K values close to zero indicate a random or convergent pattern of evolution.
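The assignment of environmental preferences can be illustrated with the sketch below, which classifies a single OTU by the sign of a significant Spearman correlation with pH. Several choices here are assumptions made for the illustration and are not taken from the study: the significance level (alpha = 0.05), the permutation-based p-value with 999 permutations and a fixed seed, and all function names.

```python
import math
import random

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(x, y, n_permutations=999, seed=0):
    """Two-sided permutation p-value for rho (assumed test, fixed seed)."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_permutations + 1)

def ph_preference(abundance, ph, alpha=0.05):
    """Label an OTU by the sign of a significant correlation with pH."""
    rho = spearman_rho(abundance, ph)
    if math.isnan(rho) or permutation_p_value(abundance, ph) >= alpha:
        return "no preference"
    return "alkaline-preferred" if rho > 0 else "acid-preferred"
```

The resulting preference labels (or the correlation coefficients themselves) serve as the trait values on which the phylogenetic signal is then tested.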
Ecological community assembly analyses were performed by applying the null model within the framework described by Stegen et al. (2013). Networks were constructed based on Spearman’s rank correlations as described by Hu et al. (2017). Detailed descriptions of the ecological community assembly and network analyses are provided in the supplementary material.