Genome assembly and evaluation
Long reads generated by PacBio sequencing were corrected and assembled using Canu v1.8 (Koren et al., 2017) with default parameters. The initial assembly was polished three times using Pilon v1.22 (Walker et al., 2014) with Illumina paired-end short reads. Because of the high degree of heterozygosity, the two haplotypes in some regions of the genome may have been assembled as separate primary contigs (Roach, Schmidt, & Borneman, 2018). To correct these possible allelic contigs, we processed the polished assembly with the Purge Haplotigs pipeline, which identifies pairs of syntenic contigs and removes one contig of each pair (Roach et al., 2018), resulting in a contig-level genome.
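The three polishing rounds can be pictured as a chain in which each round takes the previous round's output as its input genome. The following is a minimal sketch of that chaining, not the exact commands used in this study: the file names, the `pilon.jar` path, and the helper `pilon_rounds` are hypothetical. It also assumes that the short reads are re-aligned to the current assembly before each round to produce the BAM file passed to `--frags`; that alignment step is not shown.

```python
from typing import List


def pilon_rounds(draft: str, bam_pattern: str, rounds: int = 3) -> List[List[str]]:
    """Build one Pilon command per round (hypothetical helper).

    Each round polishes the FASTA produced by the previous round.
    The commands are only constructed here, not executed.
    """
    commands = []
    genome = draft
    for i in range(1, rounds + 1):
        prefix = f"pilon_round{i}"
        commands.append([
            "java", "-jar", "pilon.jar",              # assumed location of the Pilon jar
            "--genome", genome,                       # assembly to polish in this round
            "--frags", bam_pattern.format(round=i),   # short reads aligned to `genome`
            "--output", prefix,                       # Pilon writes <prefix>.fasta
        ])
        genome = f"{prefix}.fasta"                    # input for the next round
    return commands


# Hypothetical file names, for illustration only.
commands = pilon_rounds("canu_contigs.fasta", "illumina_round{round}.bam")
for cmd in commands:
    print(" ".join(cmd))
```

Building the commands as lists keeps the dependency between rounds explicit; each list could then be passed to `subprocess.run` once the corresponding BAM file exists.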
Clean reads from the Hi-C library were aligned to the contig-level genome using the end-to-end algorithm implemented in Bowtie2 v2.3.5 (Langmead & Salzberg, 2012), following the HiC-Pro strategy (Servant et al., 2015). The Juicer v1.5 and 3D de novo assembly (3D-DNA) pipelines were used to assemble the contigs into a chromosome-level genome (Dudchenko et al., 2017; Durand et al., 2016). Genome completeness was evaluated by analyzing single-copy orthologs with Benchmarking Universal Single-Copy Orthologs (BUSCO) v3.0.2 (Simao, Waterhouse, Ioannidis, Kriventseva, & Zdobnov, 2015), using the insecta_odb9 database (1,658 genes). Synteny between the PFM genome and the genomes of Cydia pomonella (Lepidoptera: Tortricidae) (Assembly accession: GCA_003425675.2) (Wan et al., 2019) and Spodoptera litura (Assembly accession: GCF_002706865.1) (Cheng et al., 2017) was analyzed using TBtools v0.58 (C. Chen et al., 2020).
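BUSCO completeness is reported as the percentage of reference orthologs that are complete (single-copy plus duplicated), fragmented, or missing. The sketch below illustrates that arithmetic for the 1,658 genes of insecta_odb9. The counts are hypothetical placeholders, not results from this study, and the `BuscoCounts` class and `busco_summary` function are illustrative names rather than part of BUSCO itself.

```python
from dataclasses import dataclass


@dataclass
class BuscoCounts:
    """Ortholog counts per BUSCO category (illustrative container)."""
    single: int
    duplicated: int
    fragmented: int
    missing: int

    @property
    def complete(self) -> int:
        # Complete orthologs are single-copy plus duplicated.
        return self.single + self.duplicated

    @property
    def total(self) -> int:
        return self.complete + self.fragmented + self.missing


def busco_summary(c: BuscoCounts) -> str:
    """Format the counts as a BUSCO-style one-line summary."""
    def pct(x: int) -> str:
        return f"{100 * x / c.total:.1f}%"

    return (
        f"C:{pct(c.complete)}[S:{pct(c.single)},D:{pct(c.duplicated)}],"
        f"F:{pct(c.fragmented)},M:{pct(c.missing)},n:{c.total}"
    )


# Hypothetical counts that sum to the 1,658 genes in insecta_odb9.
counts = BuscoCounts(single=1550, duplicated=30, fragmented=40, missing=38)
summary = busco_summary(counts)
print(summary)
```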