3.1 Effects of UV-B stress on the leaves of G. hirsutum
UV-B stress (16 kJ m–2 d–1) induced significant physiological responses in cotton leaves after only 6 h of treatment. Characteristic signs, namely distinct brown spots (Fig. 1A), dark blue stains (Fig. 1B), and brown precipitates (Fig. 1C), appeared when UV-B-treated leaves were stained with NBT, Evans blue, and DAB, respectively. In addition, the levels of O2– and H2O2 in the leaves of the UV group were significantly increased, by 28-fold (Fig. 1D) and nearly 3-fold (Fig. 1E), respectively, compared with those of the CK group. Notably, the phenotypic effects of UV-B stress on cotton leaves were delayed: visible signs of drying and wilting did not appear until two days after UV-B treatment (Fig. 1F and 1G). These results suggest that UV-B stress promotes the accumulation of O2– and H2O2 in cotton leaves, and that the increased ROS content eventually damages leaf cells.
3.2 Evolution of structural genes of the GSH metabolic pathway in G. hirsutum
After scanning the whole genome of G. hirsutum, we identified a total of 205 structural genes in the GSH metabolic pathway (Table S3). Sequence analysis showed that most of the genes encoding the 12 structural enzymes in the tetraploid cotton were doubled by hybridization and polyploidization. These 205 genes correspond to 125 gene loci, of which 80 include both At- and Dt-homoeologs (160 genes), whereas 18 and 27 have only At-homoeologs or only Dt-homoeologs, respectively. With the exception of GCL and GS, each of which is encoded by a single pair of homoeologous genes, the other 10 enzymes (G6PDH, GGCT, GGT, GPX, GR, GST, IDH, LAP, OXP, and PGDC) in the cotton genome are encoded by multiple gene loci, ranging from two in the OXP family to 76 (121 genes) in the GST family. In addition, according to an earlier study on diploid cottons (Dong et al., 2016), these GST genes can be further subdivided into eight classes, namely GSTF (Phi), GSTU (Tau), GSTT (Theta), GSTZ (Zeta), GSTL (Lambda), DHAR (dehydroascorbate reductase), TCHQD (tetrachlorohydroquinone dehalogenase), and EF1Bγ (the eukaryotic translation elongation factor 1B). Here, we identified 70 Tau, 15 Phi, 10 Theta, seven Lambda, six DHAR, four Zeta, three EF1Bγ, and two TCHQD genes. Moreover, we also identified four additional genes (two loci) that contain glutathione S-transferase structural domains but encode microsomal glutathione S-transferases (MGST) (Table S3).
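The locus and gene counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch using only the numbers stated in this paragraph; the variable names are illustrative and are not taken from the original analysis.

```python
# Locus categories reported in the text
paired_loci = 80   # loci with both At- and Dt-homoeologs
at_only_loci = 18  # loci with only an At-homoeolog
dt_only_loci = 27  # loci with only a Dt-homoeolog

# Each paired locus contributes two genes; each single-subgenome locus contributes one
total_loci = paired_loci + at_only_loci + dt_only_loci
total_genes = 2 * paired_loci + at_only_loci + dt_only_loci

# GST class counts reported in the text (genes per class)
gst_class_counts = {
    "Tau": 70, "Phi": 15, "Theta": 10, "Lambda": 7,
    "DHAR": 6, "Zeta": 4, "EF1Bgamma": 3, "TCHQD": 2,
}
mgst_genes = 4  # microsomal GST genes (two loci)

gst_total = sum(gst_class_counts.values()) + mgst_genes

print(total_loci, total_genes, gst_total)
```

Under these assumptions, the eight classes together with the four MGST genes account for the 121 genes attributed to the GST family.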
These structural genes differ in gene length, gene structure, and sequence similarity. The shortest gene is a GST gene (Gohir.D13G137000) with a length of 261 bp, and the longest gene is an OXP gene (Gohir.D02G190400) with a length of 3,873 bp. In addition, the extent of gene length variation differed across the 12 gene families. The sequence lengths of the members of the GST gene family differed the most, with an approximately 5-fold difference in gene length (261 bp vs. 1,266 bp), whereas the sequence lengths of the two genes (a pair of homoeologs) of the GS gene family were exactly equal (1,650 bp). The gene structures of different members in the 12 gene families also varied: one member (Gohir.D02G122700) of the GR family had the most exons (18), whereas three members (Gohir.A11G134900, Gohir.D01G170400, Gohir.D11G141100) of the GST family, two members (Gohir.A03G167300, Gohir.D02G190400) of the OXP family, and two members (Gohir.A12G231700, Gohir.D12G235450) of the PGDC family had the fewest exons (only one) (Table S3).
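As a rough illustration of how the within-family length variation can be expressed, the fold difference is the ratio of the longest to the shortest member. The sketch below uses only the two families whose extremes are given in the text; the helper name `fold_range` is hypothetical.

```python
def fold_range(lengths_bp):
    """Ratio of the longest to the shortest gene length in a family."""
    return max(lengths_bp) / min(lengths_bp)

# Shortest and longest gene lengths (bp) as stated in the text
extremes_bp = {
    "GST": (261, 1266),
    "GS": (1650, 1650),
}

fold_by_family = {family: fold_range(v) for family, v in extremes_bp.items()}

for family, fold in fold_by_family.items():
    print(f"{family}: {fold:.2f}-fold")
```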
Pairwise comparison of protein sequences (Table S4) showed that the sequence similarity profiles within the 12 gene families are remarkably diverse. The greatest variation was in the GST gene family, with pairwise sequence similarity between members ranging from 10.9% to 100%. Even within subfamilies, some gene pairs with low sequence similarity (less than 30%) were identified in the GSTF, GSTL, GSTT, and GSTU classes of the GST gene family. On reanalyzing these divergent gene pairs, we found that low sequence similarity arose in pairwise comparisons when one of the two genes was much shorter than the other. For example, Gohir.D08G200400 is the shortest gene (522 bp) in the GSTTs, and the sequence similarity between this gene and the longer gene Gohir.D13G021400 (966 bp) in the GSTLs is only 24%. Similarly, Gohir.D04G134000, the shortest gene (309 bp) in the LAP gene family, shared only 15–18% sequence similarity with the other, longer LAP genes (1,530–1,863 bp) (Tables S3 and S4). Notably, most of the short genes identified in this study have alternative splicing isoforms. The two genes mentioned above (Gohir.D04G134000 and Gohir.D08G200400) have at least 10 and three isoforms, respectively. This result suggests that these particular genes may undergo different patterns of regulatory evolution.
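The length effect described above can be illustrated with a toy calculation. The sketch below is not the method used to produce Table S4: it assumes a pre-aligned pair of made-up peptide sequences and computes percent identity (a simpler measure than similarity), normalized either by the longer or by the shorter sequence. The function name and sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, normalize_by="longer"):
    """Percent identity of two gapped, equal-length aligned sequences.

    normalize_by: "longer" divides matches by the longer ungapped length,
    "shorter" divides by the shorter ungapped length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same residue (gaps excluded)
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    len_a = len(aligned_a.replace("-", ""))
    len_b = len(aligned_b.replace("-", ""))
    denom = max(len_a, len_b) if normalize_by == "longer" else min(len_a, len_b)
    return 100.0 * matches / denom

# Made-up full-length peptide (19 residues) and a 6-residue fragment of it
full_aln = "MSLKVYGHPMSAPTRRVLA"
short_aln = "-" * 5 + full_aln[5:11] + "-" * 8

identity_vs_longer = percent_identity(full_aln, short_aln, "longer")
identity_vs_shorter = percent_identity(full_aln, short_aln, "shorter")

print(f"normalized by longer:  {identity_vs_longer:.1f}%")
print(f"normalized by shorter: {identity_vs_shorter:.1f}%")
```

In this toy case the fragment matches the full-length sequence perfectly over its own length, yet the score normalized by the longer sequence is low, which mirrors how a truncated gene model can depress pairwise values.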
To evaluate the phylogenetic relationships of these 205 structural genes, we constructed a maximum-likelihood (ML) tree based on their amino acid sequences (Fig. 2). As expected, most structural genes encoding the same enzyme were grouped into the same clade in the tree, and most of the gene loci were amplified by retaining homoeologous copies. In addition, eight subclades representing DHAR-, EF1Bγ-, Lambda-, Phi-, Tau-, TCHQD-, Theta-, and Zeta-like GST genes