Data analyses
The quality of the raw reads was assessed using Fastp v0.23.1 (Chen et al., 2018), and read pairs were discarded if any of the following applied: (1) adapter contamination was present in either read; (2) more than 10% of bases were uncertain in either read; or (3) the proportion of low-quality (Phred quality < 5) bases exceeded 50% in either read.
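The three filtering criteria can be illustrated with a minimal Python sketch. It is not Fastp's implementation; it only restates the stated thresholds as code. The function names, the tuple layout of a read, the use of "N" to mark an uncertain base, and the treatment of empty reads are assumptions made for illustration.

```python
from typing import Sequence, Tuple

# A read is modelled here as (sequence, per-base Phred scores, adapter flag).
# This layout is hypothetical and chosen only for the example.
Read = Tuple[str, Sequence[int], bool]


def read_fails(seq: str, quals: Sequence[int], has_adapter: bool,
               max_n_frac: float = 0.10, low_q: int = 5,
               max_low_frac: float = 0.50) -> bool:
    """Return True if a single read violates any of the three criteria."""
    n = len(seq)
    # Criterion 1: adapter contamination (empty reads also fail; assumption).
    if n == 0 or has_adapter:
        return True
    # Criterion 2: more than 10% uncertain bases (assumed to be coded as N).
    if seq.upper().count("N") / n > max_n_frac:
        return True
    # Criterion 3: more than 50% of bases with Phred quality below 5.
    low = sum(1 for q in quals if q < low_q)
    return low / n > max_low_frac


def keep_pair(read1: Read, read2: Read) -> bool:
    """A pair is kept only if neither mate fails a criterion."""
    return not (read_fails(*read1) or read_fails(*read2))
```

Because the criteria are evaluated per read but applied per pair, a single failing mate removes both reads, which keeps the paired files synchronized for downstream alignment.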
Because no whole-genome sequence is available for V. stejnegeri, the genome sequence of its close relative, Protobothrops mucrosquamatus, was downloaded from the NCBI database (https://ftp.ncbi.nlm.nih.gov/genomes/al/GCA/001/527/595/GCA001527695.3_P.Mucros_1.0/) and used as the host reference genome. Reads were aligned to this reference using Bowtie2 v2.4.1 (Langmead & Salzberg, 2012; Langmead et al., 2019) to remove host sequences. After host removal, the sequences were aligned to the standard archaea, bacteria, human, UniVec_Core, and viral databases using Kraken2 v2.0.7 to obtain annotation information and abundance tables (Wood, Lu & Langmead, 2019). Target sequences were then partitioned into smaller k-mer segments and assembled using Megahit v1.2.9 (Li et al., 2015, 2016) with the parameters (-t 8 -m 0.95 --min-contig-len 300 --k-min 51 --k-max 127 --k-step 20). The assembled sequences were clustered using Cd-hit v4.8.1 (Fu et al., 2012) with the parameters (-c 0.95 -aS 0.9 -g 1 -sc 1 -sf 1 -T 8 -M 8000) to obtain de-redundant sequences. Protein sequences were predicted using Prodigal v2.6.3 (Hyatt et al., 2010). Finally, the predicted protein sequences were annotated by comparison against the carbohydrate database v3.0.5 with Run_dbcan (Zhang et al., 2018), and quantification was performed using Salmon v0.14.1 (Patro et al., 2017).
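For reproducibility, the assembly and clustering parameters reported above can be written out as explicit command lines. The Python sketch below only builds and prints the commands; it does not run them. The option values are taken from the text, whereas all file and directory names are hypothetical, and the choice of the `cd-hit-est` executable (the nucleotide variant of Cd-hit) is an assumption, since the text does not say which program of the suite was used.

```python
import shlex
from typing import List


def megahit_cmd(reads_1: str, reads_2: str, out_dir: str) -> List[str]:
    """Megahit call with the assembly parameters reported in the text."""
    return [
        "megahit", "-1", reads_1, "-2", reads_2, "-o", out_dir,
        "-t", "8", "-m", "0.95", "--min-contig-len", "300",
        "--k-min", "51", "--k-max", "127", "--k-step", "20",
    ]


def cdhit_cmd(contigs: str, out_fasta: str,
              program: str = "cd-hit-est") -> List[str]:
    """Cd-hit call with the clustering parameters reported in the text.

    The default executable name is an assumption, not stated in the source.
    """
    return [
        program, "-i", contigs, "-o", out_fasta,
        "-c", "0.95", "-aS", "0.9", "-g", "1",
        "-sc", "1", "-sf", "1", "-T", "8", "-M", "8000",
    ]


if __name__ == "__main__":
    # Hypothetical input and output names, for illustration only.
    assembly = megahit_cmd("sample_nonhost_1.fq.gz",
                           "sample_nonhost_2.fq.gz",
                           "megahit_out")
    clustering = cdhit_cmd("megahit_out/final.contigs.fa",
                           "contigs_nr.fa")
    print(shlex.join(assembly))
    print(shlex.join(clustering))
```

Keeping each command as a list of arguments makes it straightforward to pass to `subprocess.run` later without shell quoting issues.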
Grouped percentage stacked column charts of species abundance were produced on the Wekemo Bioincloud platform (https://bioincloud.tech/task-meta). α-Diversity indices were calculated in R v4.3.1 (R Development Core Team 2020), including the abundance-based coverage estimator (ACE), Simpson’s diversity index, the Chao1 estimator, the Shannon-Wiener index, observed species, and Good’s coverage index. The Kruskal-Wallis test was used to compare α-diversity indices among the three populations.
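Several of these indices have simple closed forms, sketched below in Python for a single sample's taxon count vector. This is an illustration of the standard definitions, not the R code used in the study. It assumes the natural logarithm for the Shannon-Wiener index, the 1 − Σp² form of Simpson’s index, and the bias-corrected form of Chao1; ACE is omitted for brevity. The function names are hypothetical.

```python
import math
from typing import Sequence


def observed_species(counts: Sequence[int]) -> int:
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts: Sequence[int]) -> float:
    """Shannon-Wiener index H = -sum(p * ln p), natural log assumed."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts: Sequence[int]) -> float:
    """Simpson's diversity index in the form 1 - sum(p^2) (assumed form)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts: Sequence[int]) -> float:
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return observed_species(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def goods_coverage(counts: Sequence[int]) -> float:
    """Good's coverage: 1 - (singletons / total reads)."""
    f1 = sum(1 for c in counts if c == 1)
    return 1.0 - f1 / sum(counts)
```

Each function takes one sample at a time, so the per-sample values can then be grouped by population and passed to a rank-based test such as Kruskal-Wallis.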
Principal coordinate analysis (PCoA) based on Bray-Curtis distance was performed to detect differences in gut microbiota composition among the three populations of V. stejnegeri. Adonis (permutational multivariate analysis of variance, PERMANOVA) was used to test for significant differences among the three populations in R v4.3.1. Additionally, linear discriminant analysis (LDA) effect size (LEfSe) was used to identify taxa, from the phylum to the genus level, whose abundance differed significantly among the three populations. LDA was then applied to estimate the effect size of each differentially abundant taxon (Segata et al. 2011), with the results visualized in R v4.3.1. The LDA score threshold was set to log10(LDA) ≥ 2.
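The Bray-Curtis distance underlying the PCoA and Adonis analyses can be sketched in a few lines of Python. For two samples with abundance vectors x and y it is Σ|x_i − y_i| / Σ(x_i + y_i). The code below illustrates that definition and is not the R code used in the study; the function names, the dictionary layout of the input, and the convention of returning 0 for two empty samples are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def bray_curtis(x: Sequence[float], y: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance vectors."""
    if len(x) != len(y):
        raise ValueError("abundance vectors must have the same length")
    denom = sum(x) + sum(y)
    if denom == 0:
        return 0.0  # two empty samples are treated as identical (assumption)
    return sum(abs(a - b) for a, b in zip(x, y)) / denom


def distance_matrix(
    samples: Dict[str, Sequence[float]]
) -> Tuple[List[str], List[List[float]]]:
    """Pairwise Bray-Curtis matrix; rows and columns follow the returned names."""
    names = list(samples)
    matrix = [[bray_curtis(samples[a], samples[b]) for b in names]
              for a in names]
    return names, matrix
```

The resulting symmetric matrix, with zeros on the diagonal, is the kind of input that an ordination such as PCoA or a permutation test such as PERMANOVA operates on.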