Hi-C analysis and chromosome assembly
To obtain a chromosome-scale genome assembly, a Hi-C library was
constructed for sequencing. Muscle tissue of the burbot was used to
prepare the library, following Rao et al. (2014). High-quality Hi-C
fragment libraries were sequenced on the Illumina NovaSeq-6000
platform. The sequencing reads were mapped to the polished burbot genome
with Bowtie 1.2.2. The two read ends were independently aligned to the
genome, and only read pairs with both ends uniquely aligned were
retained. Lachesis (Burton et al., 2013) was then run with default
parameters to build the chromosome-level genome assembly from the
corrected contigs and the valid Hi-C reads. A genome-wide Hi-C heatmap
was generated with the R package ggplot2 to evaluate the quality of the
chromosome-level genome assembly.
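The read-pair selection and the contact counts behind a Hi-C heatmap can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the in-memory dictionaries stand in for parsed aligner output, and the names select_unique_pairs, contact_counts and bin_size are hypothetical, introduced here only for illustration.

```python
from collections import defaultdict


def select_unique_pairs(end1_hits, end2_hits):
    """Keep read pairs whose two ends each have exactly one alignment.

    end1_hits / end2_hits map a read name to a list of (contig, position)
    alignments. This layout is an assumption, not the aligner's own format.
    """
    pairs = []
    for name, hits1 in end1_hits.items():
        hits2 = end2_hits.get(name, [])
        if len(hits1) == 1 and len(hits2) == 1:
            pairs.append((hits1[0], hits2[0]))
    return pairs


def contact_counts(pairs, bin_size):
    """Count contacts between fixed-size genomic bins.

    Each pair is stored once, with its two bins in sorted order, so the
    result corresponds to one triangle of a symmetric contact matrix.
    """
    counts = defaultdict(int)
    for (contig_a, pos_a), (contig_b, pos_b) in pairs:
        bin_a = (contig_a, pos_a // bin_size)
        bin_b = (contig_b, pos_b // bin_size)
        counts[tuple(sorted((bin_a, bin_b)))] += 1
    return dict(counts)


# Toy input: r1 is unique at both ends, r2 is multi-mapped at end 1,
# and r3 has no alignment at end 2, so only r1 is retained.
end1 = {"r1": [("ctg1", 120)],
        "r2": [("ctg1", 900), ("ctg2", 50)],
        "r3": [("ctg2", 1500)]}
end2 = {"r1": [("ctg1", 2300)],
        "r2": [("ctg1", 400)],
        "r3": []}

valid_pairs = select_unique_pairs(end1, end2)
matrix = contact_counts(valid_pairs, bin_size=1000)
```

Counts of this kind, one value per pair of bins, are what a heatmap plots as colour intensity. In a well-ordered assembly the strongest signal is expected along the diagonal within each chromosome.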
Two methods were used to assess the completeness and accuracy of
the genome assembly. First, the Illumina and PacBio reads were aligned
to the burbot genome assembly using BWA-MEM (version 0.7.10-r789) (Li
& Durbin, 2009) and BLASR (Mark et al., 2012), respectively. Second,
the completeness of the genome assembly was evaluated using BUSCO
(version 2.0) (Simão et al., 2015) to search the genome against the
Actinopterygii database, which consists of 4,584 single-copy orthologs.
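As a hedged illustration of how BUSCO category counts translate into completeness percentages, the sketch below divides each category by the 4,584 orthologs in the database. The function name busco_summary is hypothetical, and the counts in the example call are placeholders chosen only to sum to 4,584; they are not results from this study.

```python
def busco_summary(single, duplicated, fragmented, missing, total=4584):
    """Convert BUSCO category counts into percentages of the ortholog set.

    total defaults to the 4,584 single-copy orthologs of the Actinopterygii
    database; the four counts must add up to it.
    """
    counts = {
        "single": single,
        "duplicated": duplicated,
        "fragmented": fragmented,
        "missing": missing,
    }
    if sum(counts.values()) != total:
        raise ValueError("category counts must sum to the total ortholog number")
    percentages = {key: 100.0 * value / total for key, value in counts.items()}
    # Complete BUSCOs are the single-copy and duplicated categories combined.
    percentages["complete"] = 100.0 * (single + duplicated) / total
    return percentages


# Placeholder counts for illustration only (not measured values).
example = busco_summary(single=4000, duplicated=100, fragmented=200, missing=284)
```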