Data analyses
The quality of the raw reads was assessed using fastp v0.23.1
(Chen et al., 2018), with read pairs discarded if: (1) adapter
contamination was present in either read; (2) more than 10% of bases
were uncertain (N) in either read; or (3) the proportion of low-quality
(Phred quality < 5) bases exceeded 50% in either read.
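The three discard criteria can be illustrated with a minimal Python sketch. This is not the fastp implementation; the function names, the tuple layout of a read (sequence, list of Phred scores, adapter flag) and the handling of empty reads are assumptions made for illustration only.

```python
# Illustrative sketch of the three read-pair discard criteria described above.
# Not fastp itself: names and the read representation are hypothetical.

MAX_N_FRACTION = 0.10      # criterion 2: more than 10% uncertain bases
LOW_QUALITY_PHRED = 5      # criterion 3: bases with Phred quality < 5 ...
MAX_LOW_QUAL_FRACTION = 0.50  # ... must not exceed 50% of the read


def read_fails(sequence, phred_scores, has_adapter):
    """Return True if a single read trips any of the three criteria."""
    length = len(sequence)
    if length == 0:
        # Assumption: an empty read carries no information and is rejected.
        return True
    n_fraction = sequence.upper().count("N") / length
    low_fraction = sum(q < LOW_QUALITY_PHRED for q in phred_scores) / length
    return (
        has_adapter
        or n_fraction > MAX_N_FRACTION
        or low_fraction > MAX_LOW_QUAL_FRACTION
    )


def keep_pair(read1, read2):
    """A pair is kept only if neither mate fails any criterion."""
    return not (read_fails(*read1) or read_fails(*read2))
```

Because each criterion applies to "either read", a single failing mate is enough to discard the whole pair.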
Because no whole-genome sequence was available for V. stejnegeri,
the genome sequence of its close relative, Protobothrops mucrosquamatus, was used as the host reference
genome, downloaded from the NCBI database
(https://ftp.ncbi.nlm.nih.gov/genomes/al/GCA/001/527/595/GCA001527695.3_P.Mucros_1.0/).
Reads were aligned to this reference using Bowtie2 v2.4.1 (Langmead &
Salzberg, 2012; Langmead et al., 2019) to remove host sequences.
After host removal, the sequences were classified against the standard
archaeal, bacterial, human, UniVec_Core, and viral databases using Kraken2 v2.0.7
to obtain annotation information and abundance tables (Wood, Lu &
Langmead, 2019). Target sequences were then decomposed into
k-mers and assembled using MEGAHIT v1.2.9 (Li et al., 2015,
2016) with the parameters (-t 8 -m 0.95 --min-contig-len 300
--k-min 51 --k-max 127 --k-step 20). The assembled sequences were
clustered using CD-HIT v4.8.1 (Fu et al., 2012) to obtain
non-redundant sequences with the parameters (-c 0.95 -aS 0.9 -g 1 -sc 1
-sf 1 -T 8 -M 8000). Protein sequences were predicted using Prodigal
v2.6.3 (Hyatt et al., 2010). Finally, the annotated protein
sequences were obtained by comparison against the carbohydrate database
v3.0.5 with run_dbcan (Zhang et al., 2018), with quantification
performed using Salmon v0.14.1 (Patro et al., 2017).
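For reproducibility, the MEGAHIT parameters reported above can be written out as an explicit argument list. The sketch below only builds the command and does not run it. The input and output paths are hypothetical placeholders, and the paired-end input options (-1, -2) and output option (-o) are assumed, since the text does not state how the reads were supplied.

```python
# Sketch: assemble the MEGAHIT invocation from the parameters given in the text.
# File paths are hypothetical; -1/-2/-o are assumed (paired-end input, output dir).

def megahit_command(reads_1, reads_2, out_dir):
    """Return the argument list for one MEGAHIT assembly run."""
    return [
        "megahit",
        "-1", reads_1,              # host-filtered forward reads (assumed input)
        "-2", reads_2,              # host-filtered reverse reads (assumed input)
        "-o", out_dir,              # output directory (assumed)
        "-t", "8",                  # CPU threads
        "-m", "0.95",               # fraction of memory MEGAHIT may use
        "--min-contig-len", "300",  # drop contigs shorter than 300 bp
        "--k-min", "51",            # smallest k-mer size
        "--k-max", "127",           # largest k-mer size
        "--k-step", "20",           # increment between successive k-mer sizes
    ]


if __name__ == "__main__":
    # Print the command rather than executing it.
    print(" ".join(megahit_command("sample_R1.fq.gz", "sample_R2.fq.gz", "assembly_out")))
```

Such a list can be passed to subprocess.run once the placeholder paths are replaced with real files.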
Grouped percentage stacked column charts representing species abundance
were produced using the Wekemo Bioincloud platform
(https://bioincloud.tech/task-meta). Several diversity indices were
calculated in R v4.3.1 (R Development Core Team, 2020), including
α-diversity indices such as the abundance-based coverage estimator (ACE),
Simpson’s diversity index, Chao1 estimator, Shannon-Wiener index,
observed species, and Good’s coverage index. The Kruskal-Wallis test
was used to compare α-diversity indices among the three
populations.
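Several of the indices named above, and the Kruskal-Wallis comparison, can be sketched in plain Python. This is not the R code used in the study. It assumes natural-log Shannon entropy, the Gini-Simpson form (1 − Σp²), the bias-corrected Chao1 estimator, and, for the p-value, exactly three groups, so that the chi-square approximation has 2 degrees of freedom and a closed-form tail.

```python
import math

# Sketch of common alpha-diversity indices for one sample, given per-taxon
# read counts. Formula variants are assumptions; the study used R.

def observed_species(counts):
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def simpson(counts):
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def chao1(counts):
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    # Bias-corrected form, defined even when there are no doubletons.
    return observed_species(counts) + f1 * (f1 - 1) / (2.0 * (f2 + 1))

def goods_coverage(counts):
    f1 = sum(1 for c in counts if c == 1)
    return 1.0 - f1 / sum(counts)

def kruskal_wallis_h(groups):
    """Tie-corrected H statistic (assumes the values are not all identical)."""
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        for k in range(i, j):
            rank_sums[pooled[k][1]] += mean_rank
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    return h / (1.0 - tie_term / (n ** 3 - n))

def p_value_three_groups(h):
    # Chi-square survival function with 2 degrees of freedom: exp(-h / 2).
    return math.exp(-h / 2.0)
```

In use, one index (for example shannon) would be computed per sample, the values grouped by population, and the three lists passed to kruskal_wallis_h.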
Principal coordinate analysis (PCoA) based on Bray-Curtis distance was
performed to detect differences in the composition of gut microbiota
across the three populations of V. stejnegeri. Adonis (permutational
multivariate analysis of variance, PERMANOVA) was employed to test for
significant differences among the three populations using R v4.3.1.
Additionally, linear discriminant analysis (LDA) effect size (LEfSe)
was used to identify taxa, from the phylum to the genus level, whose
abundance differed significantly among the three populations.
LDA was then applied to estimate the effect of each such taxon
on the differences in abundance (Segata et al., 2011), with the
results visualized using R v4.3.1. The LDA score threshold was set to
a log10-transformed value greater than or equal to 2.
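The Bray-Curtis distance underlying the PCoA and Adonis analyses above can be made concrete with a short sketch. It is not the R code used in the study. It assumes each sample is a vector of abundances over the same ordered set of taxa, and it treats two all-zero samples as identical (distance 0), which is a convention chosen here for illustration.

```python
# Sketch: pairwise Bray-Curtis distances between samples.
# BC(a, b) = sum_i |a_i - b_i| / sum_i (a_i + b_i)

def bray_curtis(a, b):
    if len(a) != len(b):
        raise ValueError("samples must cover the same taxa in the same order")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # assumption: two empty samples are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def distance_matrix(samples):
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    return [[bray_curtis(s, t) for t in samples] for s in samples]
```

A matrix of this kind is the input that an ordination such as PCoA decomposes, and that a permutation test such as Adonis partitions by population.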