Sequence Analysis
Adaptors and low-quality sequences below 23 and 30 Phred quality parameters for maximum average error and maximum error at the end, respectively, were trimmed using SeqyClean v.1.9.10 (Zhbannikov et al ., 2017). Only high-quality paired-end sequences were used for further analysis. Contaminant sequences were removed using HISAT2 v2.0.5 (Kim et al ., 2015). For contaminant identification, a contaminant bank was built containing sequences of Bacteria, Nematoda, Oomycetes, and Platyhelminthes with the assembled complete genome, chromosome, or scaffold from reference or representative genomes (RefSeq category). Data were downloaded in December 2017 using NCBI Entrez Direct E-utilities v6.60 (Kans, 2018). Quality-filtered reads were aligned against the contaminant bank database using the HISAT2 flag ”un-conc” to recover paired-end sequences that failed to align concordantly with contaminants. De novo transcriptome assembling was generated using Trinity v.2.8.3 (Haas et al ., 2013) with default parameters. The longest isoforms were extracted using the TRINITY package script get_longest_isoform_seq_per_trinity_gene.pl.
The completeness of our transcriptome assembly was assessed using the Benchmarking Universal Single-Copy Ortholog v5.2.2 (BUSCO, http://busco.ezlab.org, Manniet al. , 2021). Thede novo assembled transcriptome was annotated against the Viridiplantae SwissProt database (www.uniprot.org) (downloaded on September 18, 2019) using the BLASTX program, of BLAST suite (Camachoet al. , 2008), with 1e-5 e-value threshold. Annotated assemblies had their GeneOntology (GO) terms retrieved with the tool ”Retrieve/ID mapping” on the UniProt website (www.uniprot.org), and parental terms were recovered using Blast2GO (version 5.2.5) ”Combined Graph” tool (Conesa et al ., 2005).
Further, functional annotation of B. magnifica cambium and differentiating xylem transcriptome was compared to that of the model species Populus (P. × euramericana ) and Eucalyptus grandis annotated by Zinkgraf et al .(2017) over the data generated by Xu et al. (2014). A. thaliana orthologs of each expressed gene from the two model species retrieved by the authors were used to recover the associated GO terms using the UniProt website. For transcriptome comparison, we selected the Biological Process GOs associated with at least 10 % of the transcripts using the Blast2GO ”Combined Graph” tool for all three species. A heat map created to visualize the functional comparison between B. magnifica ,Populus, and Eucalyptus was produced using the online tool Morpheus (Broad Institute, URL: https://software.broadinstitute.org/morpheus/).
For differential expression analysis, the 12 RNA-Seq samples were mapped over the GO term-associated transcripts using Salmon v.0.11.3 (Patro et al ., 2017). The transcript abundances were then used to identify the differentially expressed genes (DEGs) between self-supporting and lianescent-phases. The significance of differences was assessed with the package edgeR v.3.26.8 (Robinson et al ., 2010) from Bioconductor v.3.9 software (Huber et al ., 2015). For this purpose, we normalized the 12 libraries with the TMM method (Robinson & Oshlack, 2010) using the calcNormFactors() function, while common dispersions were calculated with the estimateCommonDisp() function (Robinson & Smyth, 2008). The threshold adopted for significance assessing was p < 0.05, false discovery rate (FDR) < 0.05, and log2 fold change > 2. GO functional enrichment analysis of DEGs was performed by Fisher exact test (FDR < 0.05) using Blast2GO (version 5.2.5).
Results