Sequence Analysis
Adaptors and low-quality sequences below 23 and 30 Phred quality
parameters for maximum average error and maximum error at the end,
respectively, were trimmed using SeqyClean v.1.9.10 (Zhbannikov et
al ., 2017). Only high-quality paired-end sequences were used for
further analysis. Contaminant sequences were removed using HISAT2 v2.0.5
(Kim et al ., 2015). For contaminant identification, a contaminant
bank was built containing sequences of Bacteria, Nematoda, Oomycetes,
and Platyhelminthes with the assembled complete genome, chromosome, or
scaffold from reference or representative genomes (RefSeq category).
Data were downloaded in December 2017 using NCBI Entrez Direct
E-utilities v6.60 (Kans, 2018). Quality-filtered reads were aligned
against the contaminant bank database using the HISAT2 flag ”un-conc” to
recover paired-end sequences that failed to align concordantly with
contaminants. De novo transcriptome assembling was generated
using Trinity v.2.8.3 (Haas et al ., 2013) with default
parameters. The longest isoforms were extracted using the TRINITY
package script
get_longest_isoform_seq_per_trinity_gene.pl.
The completeness of our
transcriptome assembly was assessed using the Benchmarking Universal
Single-Copy Ortholog v5.2.2 (BUSCO, http://busco.ezlab.org, Manniet al. , 2021). Thede novo assembled transcriptome was annotated against the
Viridiplantae SwissProt database (www.uniprot.org) (downloaded on
September 18, 2019) using the BLASTX program, of BLAST suite (Camachoet al. , 2008), with 1e-5 e-value threshold.
Annotated assemblies had their GeneOntology (GO) terms retrieved with
the tool ”Retrieve/ID mapping” on the UniProt website (www.uniprot.org),
and parental terms were recovered using Blast2GO (version 5.2.5)
”Combined Graph” tool (Conesa et al ., 2005).
Further, functional annotation of B. magnifica cambium and
differentiating xylem transcriptome was compared to that of the model
species Populus (P. × euramericana ) and Eucalyptus
grandis annotated by Zinkgraf et al .(2017) over the data
generated by Xu et al. (2014). A. thaliana orthologs of
each expressed gene from the two model species retrieved by the authors
were used to recover the associated GO terms using the UniProt website.
For transcriptome comparison, we selected the Biological Process GOs
associated with at least 10 % of the transcripts using the Blast2GO
”Combined Graph” tool for all three species. A heat map created to
visualize the functional comparison between B. magnifica ,Populus, and Eucalyptus was produced using the online tool
Morpheus (Broad Institute, URL:
https://software.broadinstitute.org/morpheus/).
For differential
expression analysis, the 12 RNA-Seq samples were mapped over the GO
term-associated transcripts using Salmon v.0.11.3 (Patro et al .,
2017). The transcript abundances were then used to identify the
differentially expressed genes (DEGs) between self-supporting and
lianescent-phases. The significance of differences was assessed with the
package edgeR v.3.26.8 (Robinson et al ., 2010) from Bioconductor
v.3.9 software (Huber et al ., 2015). For this purpose, we
normalized the 12 libraries with the TMM method (Robinson & Oshlack,
2010) using the calcNormFactors() function, while common dispersions
were calculated with the estimateCommonDisp() function (Robinson &
Smyth, 2008). The threshold adopted for significance assessing was p
< 0.05, false discovery rate (FDR) < 0.05, and log2
fold change > 2. GO functional enrichment analysis of DEGs
was performed by Fisher exact test (FDR < 0.05) using Blast2GO
(version 5.2.5).
Results