2.6 Gene co-expression network construction and visualization
In order to fully understand the regulatory networks of the GSH pathway under UV-B stress, we employed the co-expression network analysis to search for potential regulatory genes of different structural genes. Since co-expression analyses require at least 15 samples to ensure the robustness and repeatability of the results. Therefore, in addition to the current three UV-B stress transcriptomes, we included additional 30 published transcriptome data of G. hirsutum under other stress treatments (drought, salt, heat, cold, CdCl2, NaOH, NaHCO3, and, Na2SO4) in subsequent analyses. The 30 transcriptome data were downloaded from the NCBI website (https://www.ncbi.nlm.nih.gov/Traces/study/). The SRA files of a total of 33 transcriptomes (Table S1) were converted into fasta files by using the fastq-dump (v.2.8.0) function of the SRA Toolkit (v.2.9.4) software (http://www.ncbi.nlm.nih.gov/books/NBK158900/). Raw data of the RNA-sequences were cleaned using Trim Galore (v. 0.6.10) software (Wang et al., 2022). Gene expression was quantified by salmon (v. 0.13.1) method (Patro et al., 2017) after indexed according to cotton reference genome (UTX_v2.1) (Chen et al., 2020). The resulting read count tables were normalized by rlog transformation, built in DESeq2 (Love et al., 2014). Gene co-expression networks were constructed using the WGCNA package in the R software (Langfelder & Horvath, 2008). All data were clustered to analyze the sample height. The soft threshold was chosen according to Rsquared > 0.85. The topological overlap matrix (TOM) was created using the resultant adjacency matrix. The genes were hierarchically clustered by TOM similarity, and the hierarchical clustering tree was cut by dynamic hybrid tree cutting algorithm to obtain modules. In order to comprehensively analyze the co-expression relationships between structural and regulatory genes, the cut-off value of weighted edge obtained from the WGCNA was set at a low value of 0.2.
The structural genes were used as target genes to search for their potential regulatory genes in the pathway based on the weighted co-expression correlations. Several co-expression networks were integrally generated based on cut-off weight values, and were visualized using Cytoscape software (v.3.8.0) (Shannon et al., 2003).
2.7 Detection of differentially expressed genes (DEGs) in the GSH pathway under UV-B stress
In order to detect the DEGs of the GSH pathway under UV-B stress, we assembled, relatively quantified, and annotated all genes in six leaf transcriptomes of the CK and UV groups. Paired-end clean reads were aligned to the cotton reference genome (UTX_v2.1) (Chen et al., 2020) using Hisat2 (v2.0.5) (Mortazavi et al., 2008). The mapped reads for each sample were assembled using StringTie (Pertea et al., 2015). The DEseq2 package (Love et al., 2014) was used to perform a differential expression analysis of the genes between the UV and CK groups (three biological replicates per group). The P- value for each gene was adjusted (Padj ) using the Benjamini and Hochberg’s method (Benjamini & Hochberg, 1995). Genes with the adjusted Padj < 0.05 and absolute log2 fold changes (log2FC) > 1 were considered as DEGs. Due to the influence of sequencing depth and gene length, the gene expression value of RNA-seq in subsequent analysis was represented by FPKM (Fragments Per Kilobase of transcript per Million reads mapped).