2 | MATERIALS AND METHODS
2.1 | Plant materials collection and DNA extraction
In this study, six medicinal plants of Polygonatum were collected from
different regions (Figure S1, Table S1). Species identity was confirmed,
and all voucher specimens were deposited at the Chinese Materia
Medica Resource Center, Anhui University of Chinese Medicine (Hefei,
China).
Healthy, fresh leaves were selected for total genomic DNA extraction
using a plant DNA mini kit (Plant DNA Kit D3485, Omega Bio-Tek,
Guangzhou, China). The purity, integrity, and concentration of the DNA
were checked using a NanoDrop 2000 spectrophotometer (Thermo Scientific,
Wilmington, DE, USA) and 1.0% (w/v) agarose gel electrophoresis
(Wu et al., 2021). Samples were considered suitable for chloroplast
genome sequencing when the DNA concentration was ≥ 20 ng/μL, the total
amount was ≥ 100 ng, and the OD260/280 ratio was 1.8–2.2. Only DNA
meeting these criteria was used to construct libraries (Zhu et al.,
2018).
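The acceptance criteria above can be expressed as a simple check. The following is a minimal sketch, not the authors' procedure; the `DnaSample` class and its field names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DnaSample:
    """Hypothetical record of the measurements taken for one DNA extract."""
    name: str
    concentration_ng_per_ul: float  # DNA concentration
    total_ng: float                 # total DNA amount
    od260_280: float                # absorbance ratio


def passes_qc(sample: DnaSample) -> bool:
    """Apply the three thresholds stated in the text."""
    return (
        sample.concentration_ng_per_ul >= 20.0
        and sample.total_ng >= 100.0
        and 1.8 <= sample.od260_280 <= 2.2
    )
```

A sample that fails any one of the three thresholds is rejected, which mirrors the requirement that all criteria be met before library construction.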
2.2 | Chloroplast DNA sequencing, assembly and annotation
Genesky Biotechnologies Inc. (Shanghai, China) was commissioned to
randomly sequence the chloroplast genomes of each Polygonatum DNA sample
using the Illumina HiSeq 4000 platform. After quality control, genomic
DNA was fragmented and adaptors were ligated to construct the
libraries. To obtain high-quality sequencing data and improve the accuracy
of subsequent bioinformatic analyses, the raw sequencing data were
quality-controlled and filtered as follows: reads containing more than
3 N bases were excluded, reads with less than 60% high-quality bases
(Phred score ≥ 20) were eliminated, low-quality bases at the 3' end were
trimmed, and reads shorter than 60 bp were removed. The clean reads were
then assembled at the contig level. Using a closely related species as
the reference, genome assembly was performed with metaSPAdes (Nurk et al.,
2017), and the assembly results were analyzed and corrected to determine
whether the genome was circular, to correct the contig orientation, and
to determine the starting base position. The
chloroplast genomes were annotated using CPGAVAS2 software (Shi et al.,
2019). Circular gene maps were drawn from the GenBank files using GeSeq
(https://chlorobox.mpimp-golm.mpg.de/geseq.html) (Tillich et al., 2017).
The sequence data and gene annotation information were uploaded to the
National Center for Biotechnology Information (NCBI) database.
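The read-filtering criteria described above can be illustrated with a short sketch. This is an assumption-laden example rather than the sequencing provider's pipeline: the function names are hypothetical, the order in which the filters are applied is assumed, and "low-quality" at the 3' end is assumed to mean Phred score < 20.

```python
def trim_3prime(seq, quals, min_q=20):
    """Remove bases from the 3' end until a base with quality >= min_q is found."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def keep_read(seq, quals, max_n=3, min_q=20, min_hq_fraction=0.60, min_len=60):
    """Return the trimmed (seq, quals) if the read passes all filters, else None.

    seq   : read sequence as a string
    quals : list of integer Phred scores, one per base
    """
    if not seq:
        return None
    # 1. Discard reads containing more than max_n ambiguous (N) bases.
    if seq.upper().count("N") > max_n:
        return None
    # 2. Discard reads with fewer than 60% high-quality bases.
    high_quality = sum(q >= min_q for q in quals)
    if high_quality / len(seq) < min_hq_fraction:
        return None
    # 3. Trim low-quality bases from the 3' end (assumed threshold: min_q).
    seq, quals = trim_3prime(seq, quals, min_q)
    # 4. Discard reads that are shorter than min_len after trimming.
    if len(seq) < min_len:
        return None
    return seq, quals
```

In practice such filtering is done by dedicated tools operating on FASTQ files; the sketch only makes the four stated criteria explicit.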
2.3 | Structure analysis of the chloroplast genomes
Simple sequence repeats (SSRs), also called microsatellites or short
tandem repeats (STRs), are tandemly repeated DNA motifs of 1–6 base pairs
that are widely used as molecular markers in genetic analysis. The SSR
sites of each sample genome were detected using the online software MISA
(Beier et al., 2017) (https://webblast.ipk-gatersleben.de/misa/),
with the minimum repeat parameters set to ten repeat units for
mononucleotides, five repeat units for dinucleotides, four repeat units
for trinucleotides, and three repeat units for tetranucleotides,
pentanucleotides, and hexanucleotides (Wang et al., 2022). Forward,
palindromic, reverse, and complementary repeats were predicted using
REPuter (https://bibiserv.cebitec.uni-bielefeld.de/reputer) with the
following parameters: Hamming distance = 3, maximum computed repeats =
5,000 bp, and minimal repeat size = 30 bp (Kurtz et al., 2001).
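To make the SSR thresholds concrete, the following sketch scans a sequence for perfect repeats using regular expressions. It is not MISA and does not reproduce its behavior in full (for example, compound SSRs are not handled); the function name `find_ssrs` and the output format are assumptions made for illustration.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return a sorted list of (1-based start, motif, repeat count) tuples."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each SSR is reported once.
            if motif in (motif + motif)[1:-1]:
                continue
            count = len(match.group(0)) // unit_len
            hits.append((match.start() + 1, motif, count))
    return sorted(hits)
```

With these thresholds, a run of twelve A bases is reported as a mononucleotide SSR, whereas three copies of AGC fall below the trinucleotide minimum of four and are ignored.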
Codon usage of the chloroplast genomes of the six medicinal plants of
Polygonatum was investigated using the relative synonymous codon
usage (RSCU) module in the Python CAI package. During translation,
synonymous codons encoding the same amino acid are not used at equal
frequencies; that is, some synonymous codons are used more frequently
than others (Parvathy et al., 2022). The RSCU value quantifies this
bias: RSCU = 1 indicates no codon preference, RSCU > 1 indicates that a
codon is used more frequently than expected, and RSCU < 1 indicates that
it is used less frequently than expected (Sharp
et al., 1986). Microsoft Office Excel and TBtools (Chen et al., 2020)
were used to convert statistical data into visual graphs.
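The RSCU of a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes this from scratch and does not use the CAI package's API. The function name `rscu`, the exclusion of stop codons, and the skipping of codons with ambiguous bases are assumptions.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard codon assignments in TCAG order (shared by the plastid code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Compute RSCU values from a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families (stop codons excluded: assumption).
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        expected = total / len(codons)  # count under equal usage
        for c in codons:
            result[c] = counts[c] / expected
    return result
```

For a single-codon amino acid such as methionine (ATG), the RSCU is always 1 whenever the amino acid occurs, which is why such codons carry no information about bias.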
2.4 | Comparison of the chloroplast genomes
The expansion and contraction of inverted repeat regions in the
chloroplast genome may lead to changes in genome length. Using CPJSdraw
software (Xu et al., 2024), we detected the inverted repeat (IR)
boundary regions by comparing the locations of the coding genes.
Sequence alignment of the whole chloroplast genome was performed using
the online tool mVISTA (Fernández-Jiménez et al., 2021)
(http://genome.lbl.gov/vista/index.shtml) in shuffle-LAGAN mode.
DnaSP software (Rozas et al., 2017) was used to calculate the nucleotide
diversity based on sliding window analysis, setting the window length to
600 bp and the step size to 200 bp. To investigate the presence of
selective pressure on the chloroplast protein-coding genes among
Polygonatum, we used P. zanlanscianense as the reference,
and the coding sequences were used to calculate Ka and Ks values with
KaKs_Calculator2 (Wang et al., 2010).
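The sliding-window nucleotide diversity calculation described above can be sketched as follows. This is an illustrative implementation, not DnaSP's: the function names are hypothetical, and excluding alignment columns that contain gaps or ambiguous bases is an assumption about how such sites are treated.

```python
from collections import Counter


def column_diffs(column):
    """Number of sequence pairs that differ at one alignment column."""
    n = len(column)
    counts = Counter(column)
    # Differing pairs = all ordered pairs minus identical ones, halved.
    return (n * n - sum(c * c for c in counts.values())) // 2


def sliding_pi(alignment, window=600, step=200):
    """Return (1-based window start, window end, pi) for each window.

    alignment : list of equal-length aligned sequences (strings)
    pi        : mean pairwise differences per valid site in the window
    """
    n = len(alignment)
    length = len(alignment[0])
    pairs = n * (n - 1) // 2
    results = []
    for start in range(0, length - window + 1, step):
        diffs = 0
        sites = 0
        for pos in range(start, start + window):
            column = [s[pos].upper() for s in alignment]
            # Assumption: skip columns with gaps or ambiguous bases.
            if any(base not in "ACGT" for base in column):
                continue
            sites += 1
            diffs += column_diffs(column)
        pi = diffs / (pairs * sites) if sites else float("nan")
        results.append((start + 1, start + window, pi))
    return results
```

With a 600 bp window and a 200 bp step, consecutive windows overlap by 400 bp, so each variable site contributes to up to three windows and the resulting profile is smoothed accordingly.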
2.5 | Phylogenetic tree construction of Polygonatum medicinal plants
The six medicinal plants of Polygonatum sequenced here and other
medicinal plants of Polygonatum downloaded from the NCBI were used for
phylogenetic analysis, with Dioscorea aspersa and Dioscorea alata set as
outgroups. A total of 59 complete chloroplast genome sequences were
aligned using MAFFT (Katoh et al., 2013) and trimmed using trimAl
(Capella-Gutiérrez et al., 2009). The best-fit model according to the
Bayesian information criterion was K3Pu+F+I+I+R4, as determined
using ModelFinder (Kalyaanamoorthy et al., 2017). A phylogenetic tree
was constructed from the whole chloroplast sequences with IQ-TREE
(Nguyen et al., 2015) on the PhyloSuite platform (Zhang et al.,
2020). The tree was visualized on the iTOL (Letunic et al., 2021) website
(https://itol.embl.de/).
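Model selection by the Bayesian information criterion, as mentioned above, ranks candidate models by BIC = -2 ln L + k ln n, where ln L is the log-likelihood, k the number of free parameters, and n the sample size (taken here to be the number of alignment sites). The sketch below shows only this ranking step; it is not ModelFinder, and the function names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion; lower values indicate a better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def best_by_bic(candidates, n_sites):
    """Pick the candidate with the lowest BIC.

    candidates : dict mapping model name -> (log_likelihood, n_params)
    Returns (best model name, dict of BIC scores).
    """
    scores = {
        name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()
    }
    return min(scores, key=scores.get), scores
```

Because the penalty term grows with ln n, a parameter-rich model is preferred over a simpler one only when its gain in log-likelihood outweighs the added parameters at the given alignment length.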