javascript:void(0)Read preprocessing and DEG analyses
Quality of the reads were checked using fastqc
(https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Adapters
were clipped and low quality bases (phred score < 20) were
removed from the reads in 4 bp sliding windows using Trimmomatic
(Bolger, Lohse, & Usadel, 2014). Genome assembly and annotation ofP. australis was obtained from Clean reads were mapped to the
reference genome using STAR aligner (Dobin et al., 2013). After sorting
using Samtools (Danecek et al., 2021), the mapped reads were processed
using StringTie (Pertea et al., 2015) to obtain the gene and transcript
count matrix. Principal component analysis (PCA) was conducted on
differentially expressed genes among the treatment groups using DEseq2
following rlog transformation of gene counts, which represent the number
of reads mapped to the reference genome (Love, Huber, & Anders, 2014).
DESeq2 provides statistical routines for determining differential
expression in digital gene expression data using a model based on the
negative binomial distribution. The resulting P values were adjusted
using the Benjamini and Hochberg’s approach for controlling the false
discovery rate. Genes with an adjusted P-value < 0.01 & Fold
Change≥2 found by DESeq2 were assigned as differentially expressed. Gene
Ontology (GO) enrichment assessment analyses sets of genes in the
context of GO categories to determine whether certain functional
categories are overrepresented or enriched within the background whole
genome gene sets. Here we tested enriched GO terms for the
differentially expressed genes (DEGs) using goatools (Klopfenstein et
al., 2018), using a criterion of P <0.05 with Bonferroni
correction. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway
analysis was performed using KAAS (Moriya, Itoh, Okuda, Yoshizawa, &
Kanehisa, 2007) and the enriched pathway of differentially expressed
genes were analyzed using clusterProfiler (Wu et al., 2021). The Venn
diagram was performed using ggvenn (Yan & Yan, 2021).
Coefficient of variation (CVa) for gene expression data was calculated
by first normalizing the raw read counts using the variance stabilizing
transformation (VST) method, which stabilizing variance across
expression levels. Samples were then grouped by population and
treatment. The CVa, defined as the ratio of the standard deviation to
the mean expression level, was computed for each gene, with genes
exhibiting zero variance excluded from further analysis. Finally, we
visualized the distribution of CVa values using density plots,
highlighting peaks and medians.