2.6 Statistical analyses
Alpha diversity metrics, including the Shannon index and Pielou’s
evenness, were used to estimate the community diversity of abundant and
rare taxa. Faith’s phylogenetic diversity (PD) was used to estimate the
phylogenetic community diversity of abundant and rare taxa. The Shannon
index, Pielou’s evenness, and Faith’s PD were calculated using the
“diversity” function in the R package “vegan” (Oksanen et al., 2015).
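
As a minimal illustration of the two non-phylogenetic indices named above, the sketch below computes the Shannon index and Pielou’s evenness for one sample in plain Python. It is not the authors’ code: the function names and the example counts are hypothetical, and the natural logarithm is used on the assumption that the default base of the “diversity” function in “vegan” was kept.

```python
import math

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

def pielou_evenness(counts):
    """Pielou's evenness J = H' / ln(S), where S is the observed richness."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return float("nan")  # evenness is undefined for fewer than two taxa
    return shannon_index(counts) / math.log(richness)

# Hypothetical OTU counts for a single sample (illustrative values only).
sample = [10, 10, 10, 10]
h = shannon_index(sample)
j = pielou_evenness(sample)
```

For a perfectly even sample such as the one above, the Shannon index equals ln(S) and the evenness equals 1.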
Further, the standardized effect size of the mean nearest-taxon
distance (SES.MNTD) was applied to evaluate the phylogenetic
clustering of abundant and rare taxa. The SES.MNTD index, which
compares the observed MNTD with the mean of the null distribution,
was calculated using the “ses.mntd” function (null.model =
‘taxa.labels’, 999 randomizations) in the R package “picante”
(Kembel et al., 2010).
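
The logic of this standardized effect size can be sketched as follows. The example assumes an unweighted MNTD and a label-shuffling null model in the spirit of ‘taxa.labels’; the function names, the toy distance matrix, and the fixed random seed are hypothetical and only serve the illustration, so this is not a reimplementation of “picante”.

```python
import random
import statistics

def mntd(dist, present):
    """Mean nearest-taxon distance among the taxa indexed in `present`."""
    nearest = [min(dist[i][j] for j in present if j != i) for i in present]
    return sum(nearest) / len(nearest)

def ses_mntd(dist, present, runs=999, seed=0):
    """SES = (observed MNTD - mean of null MNTD) / SD of null MNTD."""
    rng = random.Random(seed)
    n_taxa = len(dist)
    observed = mntd(dist, present)
    null_values = []
    for _ in range(runs):
        # Null model: shuffle taxon labels across the tips of the tree,
        # which is equivalent to relocating the present taxa at random.
        labels = list(range(n_taxa))
        rng.shuffle(labels)
        null_values.append(mntd(dist, [labels[i] for i in present]))
    return (observed - statistics.mean(null_values)) / statistics.stdev(null_values)

# Toy phylogenetic distances: taxa 0-2 are close relatives, 3-5 are distant.
toy_dist = [[0 if i == j else (1 if i < 3 and j < 3 else 10)
             for j in range(6)] for i in range(6)]
ses = ses_mntd(toy_dist, present=[0, 1, 2], runs=199, seed=1)
```

In this toy case the community contains only close relatives, so the observed MNTD falls below the null mean and the SES is negative, which is the pattern usually read as phylogenetic clustering.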
Statistical differences in the above diversity indices were determined
using the Wilcoxon rank-sum test in SPSS 22.0 (IBM, Armonk, NY, USA).
The community composition and phylogenetic variations of abundant and
rare taxa were calculated based on the Bray–Curtis dissimilarity
matrices and the beta mean nearest taxon distance (βMNTD). The
Bray–Curtis dissimilarity was calculated using the “vegdist” function
in the R package “vegan” (Oksanen et al., 2015), whereas the βMNTD
distance was calculated using the “comdistnt” function in the R package
“picante” (Kembel et al., 2010).
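
To make the two distance measures concrete, the sketch below computes a Bray–Curtis dissimilarity between two abundance vectors and an unweighted between-community nearest-taxon distance. The function names and example values are hypothetical. The βMNTD variant shown averages the nearest-taxon distances over all taxa of both samples, which is an assumption about the unweighted calculation and may differ from abundance-weighted formulations.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator

def beta_mntd(dist, taxa_a, taxa_b):
    """Unweighted between-community mean nearest-taxon distance.

    For every taxon in one community, take the distance to its closest
    relative in the other community, then average over both directions.
    """
    a_to_b = [min(dist[i][j] for j in taxa_b) for i in taxa_a]
    b_to_a = [min(dist[j][i] for i in taxa_a) for j in taxa_b]
    nearest = a_to_b + b_to_a
    return sum(nearest) / len(nearest)

# Hypothetical abundances of three OTUs in two samples.
bc = bray_curtis([6, 7, 4], [10, 0, 6])

# Hypothetical pairwise phylogenetic distances among three OTUs.
phylo_dist = [[0, 2, 4],
              [2, 0, 3],
              [4, 3, 0]]
bm = beta_mntd(phylo_dist, taxa_a=[0, 1], taxa_b=[2])
```

Similarities of the kind used in the distance-decay analysis are then obtained as one minus these dissimilarities.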
Pairwise geographic distance was determined based on the latitude and
longitude of each site using the function “distGeo” in R package
“geosphere” (Hijmans et al., 2016). Linear regression was used to
assess the relationships between bacterial community similarity (1 -
Bray–Curtis dissimilarity) or phylogenetic similarity (1 - βMNTD) and
geographic distance. Variation partitioning analysis (VPA) was used to
tease apart the pure effects of physicochemical factors, land use types,
and space on the variation of abundant and rare taxa. Spatial variables
in the VPA were derived from principal coordinates of neighbor
matrices (PCNM) analysis using the “pcnm” function in the R package
“vegan” (Borcard and Legendre, 2002; Oksanen et al., 2015).
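
The distance-decay regression described earlier in this paragraph can be sketched as follows. Note the substitution: “distGeo” computes geodesic distances on an ellipsoid, whereas this example uses the simpler haversine formula on a sphere as an approximation. The function names, the Earth radius constant, and all example values are assumptions made for the illustration.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def ols_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical pairwise distances (km) and community similarities
# (1 - Bray-Curtis dissimilarity) for three site pairs.
distance_km = [0.0, 1.0, 2.0]
similarity = [1.0, 0.8, 0.6]
slope, intercept = ols_fit(distance_km, similarity)
```

A negative slope indicates that community similarity decreases as geographic distance increases.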
Threshold indicator taxa analysis (TITAN) was carried out to detect and
interpret biodiversity and environmental thresholds of abundant and rare
taxa by using the R package “TITAN2” (Baker and King, 2010). Briefly,
the sums of indicator taxa scores for bacterial OTUs were used to
determine lower and upper thresholds of changes in abundant and rare
taxa based on each environmental variable (Jiao and Lu, 2020; Wan et
al., 2021a; Wan et al., 2021b).
Phylogenetic signal is the tendency for related species to resemble each
other more than they resemble species drawn at random from the
phylogenetic tree (Blomberg and Garland Jr, 2002). Before testing for
phylogenetic signals, we first obtained potential trait information
about both abundant and rare taxa via Spearman’s correlations
between the relative abundances of OTUs and environmental variables
(Oliverio et al., 2017). The OTUs that showed significant associations
(positive or negative) with a given environmental variable were
identified as species with a preference for that environmental variable.
For example, the OTUs positively or negatively correlated with pH were
identified as “alkaline-preferred taxa” or “acid-preferred taxa”,
respectively.
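
This trait-assignment step can be illustrated with a short sketch. It computes Spearman’s rho as the Pearson correlation of average ranks and then labels an OTU by the sign of a significant correlation with pH. The function names are hypothetical, the p-value is taken as an input rather than computed, and the significance threshold of 0.05 is an assumption, since the text does not state the cutoff that was used.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def ph_preference(rho, p_value, alpha=0.05):
    """Label an OTU by the sign of a significant correlation with pH.

    `alpha` is an assumed threshold; the source does not specify one.
    """
    if p_value >= alpha:
        return "no preference"
    return "alkaline-preferred" if rho > 0 else "acid-preferred"

# Hypothetical relative abundances of one OTU along a pH gradient.
ph_values = [5.5, 6.0, 6.8, 7.4]
otu_abundance = [0.01, 0.02, 0.05, 0.09]
rho = spearman_rho(otu_abundance, ph_values)
```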
Subsequently, phylogenetic signals for the environmental preference of
abundant and rare taxa were calculated via Blomberg’s K statistic
approach using the “multiPhylosignal” function in the “picante” R
package (Kembel et al., 2010). Blomberg’s K statistic tests
whether the observed trait variation across a phylogeny is smaller than
expected according to a Brownian motion model of trait evolution
(Blomberg et al., 2003). K values higher than 1 imply strong
phylogenetic signals and conservatism of traits, whereas K values close
to zero indicate a random or convergent pattern of evolution.
Ecological community assembly analyses were performed by applying the
null model within the framework described by Stegen et al. (2013).
Networks were constructed based on Spearman’s rank correlations as
described by Hu et al. (2017). Detailed descriptions of the ecological
community assembly and network analyses are provided in the
supplementary material.