Nanopore Cas9-targeted sequencing enables accurate and simultaneous
identification of transgene integration sites, their structure and
epigenetic status in recombinant Chinese hamster ovary cells
Abstract
The integration of a transgene expression construct into the host genome
is the initial step for the generation of recombinant cell lines used
for biopharmaceutical production. The stability and level of recombinant
gene expression in Chinese hamster ovary (CHO) cells can be correlated with
the copy number, the integration site and the epigenetic context of
the transgene vector. In addition, undesired integration events, such as
concatemers, truncated and inverted vector repeats, affect the
stability of recombinant cell lines. Thus, to characterize cell clones
and to isolate the most promising candidates, it is crucial to obtain
information on the site of integration, the structure of the integrated
sequence and its epigenetic status. Current sequencing techniques allow
this information to be gathered separately but do not offer a comprehensive
and simultaneous resolution. In this study, we present a fast and robust
nanopore Cas9-targeted sequencing (nCats) pipeline that identifies
integration sites, the composition of the integrated sequence and
its DNA methylation status in CHO cells simultaneously from the same
sequencing run. A Cas9-enrichment step
during library preparation enables targeted and directional nanopore
sequencing with up to 724x median on-target coverage and reads up to
153 kb long. The data generated by nCats provide sensitive, detailed and
correct information on the transgene integration sites and the
expression vector structure, which could only be partly recovered from
traditional Targeted Locus Amplification-Seq data. Moreover, with nCats
the DNA methylation status can be analyzed from the same raw data
without prior DNA amplification.