Genome assembly and evaluation
Long reads generated by PacBio sequencing were corrected and assembled
using Canu v1.8 (Koren et al., 2017) with default parameters. The
initial assembly was polished three times using Pilon v1.22 (Walker et
al., 2014) with short reads from Illumina paired-end sequencing.
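The three rounds of polishing can be pictured as a loop in which each round realigns the short reads to the most recent assembly and passes the alignments to Pilon. The Python sketch below only builds the shell commands and does not run them. The use of BWA-MEM and SAMtools for alignment, all file names, and the `pilon.jar` path are assumptions for illustration; the methods above do not state which aligner or settings were used.

```python
def polishing_commands(assembly, reads_1, reads_2, rounds=3, pilon_jar="pilon.jar"):
    """Build shell commands for iterative Pilon polishing.

    Assumptions (not from the paper): BWA-MEM + SAMtools for alignment,
    hypothetical file names, Pilon run from a local pilon.jar.
    Returns (list of command strings, path of the final polished FASTA).
    """
    commands = []
    current = assembly
    for i in range(1, rounds + 1):
        bam = f"round{i}.sorted.bam"
        prefix = f"pilon_round{i}"
        commands += [
            # Index the assembly produced by the previous round.
            f"bwa index {current}",
            # Align paired-end reads and sort the alignments by coordinate.
            f"bwa mem {current} {reads_1} {reads_2} | samtools sort -o {bam} -",
            f"samtools index {bam}",
            # Pilon writes the corrected assembly to <prefix>.fasta.
            f"java -jar {pilon_jar} --genome {current} --frags {bam} --output {prefix}",
        ]
        # The polished output becomes the input of the next round.
        current = f"{prefix}.fasta"
    return commands, current


# Hypothetical input names, for illustration only.
cmds, final_assembly = polishing_commands("canu.contigs.fasta", "R1.fq.gz", "R2.fq.gz")
for cmd in cmds:
    print(cmd)
```

Feeding each round's output into the next lets later rounds correct errors that only become alignable once earlier ones are fixed.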
Because of the high degree of heterozygosity, the two haplotypes in
some regions of the genome might be assembled as separate primary
contigs (Roach, Schmidt, & Borneman, 2018). To correct these possible
allelic contigs, we used the Purge Haplotigs pipeline to identify pairs
of syntenic contigs in the polished assembly and removed one contig
from each pair (Roach et al., 2018), resulting in a contig-level genome.
Clean reads sequenced from the Hi-C library were aligned to the
contig-level genome with the end-to-end algorithm implemented in Bowtie
v2.3.5 (Langmead & Salzberg, 2012), following the HiC-Pro strategy
(Servant et al., 2015). The Juicer v1.5 and
3D de novo assembly (3D-DNA) pipelines were used to assemble the
contigs into a chromosome-level genome (Dudchenko et al., 2017; Durand
et al., 2016). The completeness of the genome was evaluated by
analyzing single-copy orthologs with Benchmarking Universal
Single-Copy Orthologs (BUSCO) v3.0.2 (Simao, Waterhouse, Ioannidis,
Kriventseva, & Zdobnov, 2015), based on the insecta_odb9 database
(1,658 genes). Synteny between the PFM genome and the genomes of Cydia
pomonella (Lepidoptera: Tortricidae) (assembly accession:
GCA_003425675.2) (Wan et al., 2019) and Spodoptera litura (assembly
accession: GCF_002706865.1) (Cheng et al., 2017) was analyzed using
TBtools v0.58 (C. Chen et al., 2020).
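The BUSCO completeness evaluation described above comes down to simple proportions over the ortholog set: complete genes are the sum of single-copy and duplicated genes, and each category is reported as a percentage of the total (1,658 for insecta_odb9). The Python sketch below shows that arithmetic in the usual BUSCO summary notation. The function name and the counts in the usage line are hypothetical and are not results from this study.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Format BUSCO category counts as a one-line percentage summary.

    Complete (C) = single-copy (S) + duplicated (D); all percentages are
    taken over the total number of orthologs searched (n).
    """
    total = single + duplicated + fragmented + missing
    complete = single + duplicated

    def pct(count):
        # Percentage of the full ortholog set, rounded to one decimal place.
        return round(100 * count / total, 1)

    return (
        f"C:{pct(complete)}%[S:{pct(single)}%,D:{pct(duplicated)}%],"
        f"F:{pct(fragmented)}%,M:{pct(missing)}%,n:{total}"
    )


# Made-up counts that sum to the 1,658 genes of insecta_odb9.
print(busco_summary(single=1600, duplicated=20, fragmented=18, missing=20))
```

A high duplicated fraction in such a summary is one sign that allelic contigs remain in the assembly, which is why this check is useful after haplotig removal.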