Target enrichment of long open reading frames and ultraconserved
elements to link microevolution and macroevolution in non-model
organisms
Abstract
Despite the increasing accessibility of high-throughput sequencing,
obtaining high-quality genomic data on non-model organisms without
proximate well-assembled and annotated genomes remains challenging. Here
we describe a workflow that takes advantage of distant genomic resources
and ingroup transcriptomes to select and jointly enrich long open
reading frames (ORFs) and ultraconserved elements (UCEs) from genomic
samples for integrative studies of microevolutionary and
macroevolutionary dynamics. This workflow is applied to samples of the
African unionid bivalve tribe Coelaturini (Parreysiinae) at basin and
continent-wide scales. Our results indicate that ORFs are efficiently
captured without prior identification of intron-exon boundaries. The
enrichment of UCEs was less successful, but nevertheless produced
substantial datasets. Exploratory continent-wide phylogenetic analyses
with ORF supercontigs (>515,000 parsimony-informative
sites) resulted in a fully resolved phylogeny, the backbone of which was
also retrieved with UCEs (>11,000 informative sites).
Variant calling on ORFs and UCEs of Coelaturini from the Malawi Basin
produced ~2,000 SNPs per population pair. Estimates of
nucleotide diversity and population differentiation were similar for
ORFs and UCEs. These values were low compared to previous estimates in mollusks,
but comparable to those in recently diversifying Malawi cichlids and
other taxa at an early stage of speciation. Skimming off-target sequence
data from the same enriched libraries of Coelaturini from the Malawi
Basin, we reconstructed the maternally inherited mitogenome, which
displays the gene order inferred for the most recent common ancestor of
Unionidae. Overall, our workflow and results provide exciting
perspectives for integrative genomic studies of microevolutionary and
macroevolutionary dynamics in non-model organisms.
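
As a point of reference for the nucleotide diversity estimates mentioned above, the statistic can be illustrated as the average number of pairwise differences per site among aligned sequences. The sketch below is a minimal illustration only, not the pipeline used in this study: the function name `nucleotide_diversity` and the toy alignment are hypothetical, and the code assumes gap-free sequences of equal length with no missing data or ambiguity codes.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences.

    Assumes an ungapped alignment without missing data (a simplification).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned and of equal, non-zero length")

    # Count mismatching sites for every unordered pair of sequences.
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    # Normalise by the number of pairs and the alignment length.
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment: three sequences of eight sites each.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = nucleotide_diversity(toy_alignment)
```

In practice such estimates are usually computed from called variants together with the number of callable sites, so that invariant and uncalled positions are handled correctly; the direct pairwise count here is only meant to make the definition concrete.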