2.8 Physical Genome Intervals
Physical genome intervals in the P. trichocarpa genome (v3.0)
were examined for each significant QTL for biotic associations. The
intervals were defined as 1 Mb regions centered on the marker with the
highest LOD score. Fixed physical interval sizes were used rather than
LOD-based intervals because the magnitude of the LOD scores varied
widely among the significant QTL. For example, 1-LOD intervals
centered on the QTL ranged in size from 169 to 4620
kb. Much of this variation was likely due to variation in marker density
and local recombination rates, in addition to phenotyping and genotyping
error. We believe that a fixed 1 Mb interval is a more consistent and
conservative approach. On average, this represents approximately 6.34
cM, based on a total map size of 2617 cM and a total assembled genome
length of 420 Mb.
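
The interval definition and the cM conversion above can be illustrated with a minimal Python sketch. The function names, the example marker position, and the chromosome length are hypothetical, and clamping the window at the chromosome ends is an assumption that the text does not state. With the rounded totals quoted above (2617 cM and 420 Mb), the average comes to about 6.2 cM per Mb, close to but not identical to the reported 6.34 cM, which may rest on unrounded values.

```python
def fixed_interval(peak_bp, chrom_length_bp, width_bp=1_000_000):
    """Return (start, end) of a fixed-width window centered on the peak marker.

    Clamping to the chromosome ends is an assumption of this sketch.
    """
    half = width_bp // 2
    start = max(0, peak_bp - half)
    end = min(chrom_length_bp, peak_bp + half)
    return start, end


def mean_cm_per_mb(map_length_cm, genome_length_mb):
    """Genome-wide average genetic distance spanned by 1 Mb."""
    return map_length_cm / genome_length_mb


# Hypothetical peak marker at 12.3 Mb on a 50 Mb chromosome.
interval = fixed_interval(12_300_000, 50_000_000)
rate = mean_cm_per_mb(2617, 420)  # rounded totals from the text
print(interval, round(rate, 2))
```

A window defined this way has the same physical size for every QTL, whereas a 1-LOD interval depends on local marker density and recombination.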
Orthologous intervals were identified in the P. deltoides clone
WV94 reference genome (v2.1) obtained from Phytozome (Goodstein et al.,
2012). Orthology was determined from a combination of protein sequence
conservation and synteny using MCScanX (Wang et al., 2012). Briefly, all
proteins were compared in all-vs-all blastp searches both within
and between genomes. The resulting hits were then chained into collinear
segments using the MCScanX algorithm. Orthologous segments were
identified based on the presence of large numbers of gene pairs in
collinear order with high sequence identity (median blastp E-value
<1e-180) (Figure 2). Synonymous (Ks) and nonsynonymous (Ka)
nucleotide substitution rates were calculated using the Bioperl
DNAstatistics module (Stajich, 2002) (Table S1). Domain composition
(Table S2) and Gene Ontology (GO) terms (Table S3) were obtained for
each genome from Phytozome (v12.1). Intervals were customized for the
grandparents of the pseudo-backcross progeny (clones 93-968 for P.
trichocarpa and D124 for P. deltoides) using ~150X coverage of
2x250 bp paired-end Illumina sequences. These reads were
aligned to the respective reference genome for each species using bwa
mem with default parameters. SNPs and small indels were identified using
samtools mpileup (Li, 2011; Li et al., 2009), and sequence depth was
extracted using vcftools (Danecek et al., 2011). Reference sequences
were converted to clone-specific consensus sequences using the vcftools
utility vcf-consensus. Genes with no
coverage in the alignments were excluded from the intervals for each
species.
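
The final filtering step can be sketched in Python as follows. This is an illustration only: it assumes that "no coverage" means zero read depth at every position of the gene, and the function name, gene identifiers, coordinates, and depth values are all hypothetical. Depth is held in a plain dictionary here rather than read from vcftools output.

```python
def covered_genes(genes, depth):
    """Return IDs of genes with nonzero read depth at one or more positions.

    genes: dict of gene_id -> (chrom, start, end), 1-based inclusive
    depth: dict of (chrom, pos) -> read depth; missing positions count as 0
    """
    kept = []
    for gene_id, (chrom, start, end) in genes.items():
        positions = range(start, end + 1)
        if any(depth.get((chrom, pos), 0) > 0 for pos in positions):
            kept.append(gene_id)
    return kept


# Hypothetical example: geneA has aligned reads, geneB has none.
genes = {
    "geneA": ("Chr01", 100, 105),
    "geneB": ("Chr01", 200, 204),
}
depth = {("Chr01", 102): 37, ("Chr01", 103): 41}
print(covered_genes(genes, depth))
```

A stricter rule, such as requiring a minimum fraction of covered bases per gene, would change only the condition inside the loop.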