Genome annotation
We identified 29,228 protein-coding genes in the first round of MAKER
annotation. BUSCO analysis showed that 91.9% of the evaluated single-copy
genes were complete. After three rounds of MAKER
annotation, the number of genes increased to 52,667, and the
proportion of complete single-copy genes rose to 95.2%. After
filtering based on gene expression analysis, functional domains and AED
values, 23,218 genes remained. BUSCO analysis showed that 95.0%
(single-copy: 94.1%, duplicated: 1.1%) of the evaluated
single-copy genes were identified as complete, 1.6% were
fragmented, and 3.2% were missing in the annotated gene
set. In total, 19,206 genes (82.72%) were functionally annotated, of
which 5,970 (25.71%) and 3,134 (13.50%) were annotated with GO terms
and KEGG KOs, respectively. We predicted 53 rRNAs, 11,076 tRNAs, 20 small
nuclear RNAs, and 48 microRNAs in the PFM genome based on the Rfam
database.
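The functional-annotation percentages follow directly from the gene counts in this paragraph. A minimal arithmetic check, assuming the denominator is the 23,218 genes retained after filtering (the helper name `percent_of` is ours and is not part of the annotation pipeline):

```python
# Recompute the functional-annotation percentages from the reported counts.
# `percent_of` is a hypothetical helper introduced for this check only.

def percent_of(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)

FILTERED_GENES = 23_218  # genes remaining after filtering (from the text)

annotated = {
    "functionally annotated": 19_206,
    "GO terms": 5_970,
    "KEGG KOs": 3_134,
}

percentages = {
    label: percent_of(count, FILTERED_GENES)
    for label, count in annotated.items()
}

for label, pct in percentages.items():
    print(f"{label}: {pct:.2f}%")
```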
In total, 45.5 Mb (11.33%) of the genome was identified as repetitive
DNA. Overall, 259,729 transposable elements (TEs) were identified, including 125,601
retroelements (17,962 short interspersed nuclear elements (SINEs),
95,657 long interspersed nuclear elements (LINEs) and 11,982 long
terminal repeats (LTRs)) and 34,478 DNA transposons.
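The TE counts can be tallied to confirm that the three retroelement classes sum to the reported retroelement total. The sketch below uses only the counts given in the text. The remainder it computes (elements in neither named class) is derived here, is not a figure reported in the text, and the text does not say how those elements are classified.

```python
# Tally the reported transposable-element (TE) counts.
# All inputs are taken from the text; `remainder` is a derived value
# and is NOT a number reported in the paper.

retroelements = {"SINE": 17_962, "LINE": 95_657, "LTR": 11_982}
dna_transposons = 34_478
total_tes = 259_729

# Sum of the three retroelement classes; should match the reported 125,601.
retro_total = sum(retroelements.values())

# Elements counted in the TE total but in neither named class.
remainder = total_tes - retro_total - dna_transposons

print(f"retroelements: {retro_total:,}")
print(f"DNA transposons: {dna_transposons:,}")
print(f"not in either named class: {remainder:,}")
```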
Orthology and phylogenetic relationships of lepidopterans
OrthoFinder assigned 320,821 genes (93.41% of total) to 15,076
orthogroups for the 16 species compared. Fifty percent of the assigned
genes were in orthogroups with 28 or more genes (G50 was 28) and were
contained in the largest 3,174 orthogroups (O50 was 3,174).
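G50 and O50 are defined by analogy with the N50 statistic for assemblies. The sketch below shows one way to compute them from a list of orthogroup sizes. It is our illustration rather than OrthoFinder's own code, the function name `g50_o50` is hypothetical, and OrthoFinder's handling of ties or of the exact 50% boundary may differ.

```python
from typing import Iterable, Tuple


def g50_o50(orthogroup_sizes: Iterable[int]) -> Tuple[int, int]:
    """Return (G50, O50) for a collection of orthogroup sizes.

    G50: size of the orthogroup at which the running total of genes,
         taken from the largest orthogroup downwards, first reaches
         half of all assigned genes.
    O50: number of orthogroups needed to reach that point.
    """
    sizes = sorted(orthogroup_sizes, reverse=True)
    if not sizes:
        raise ValueError("at least one orthogroup is required")

    half = sum(sizes) / 2
    running = 0
    for rank, size in enumerate(sizes, start=1):
        running += size
        if running >= half:
            return size, rank

    # Not reachable for positive sizes: the full sum always reaches half.
    raise RuntimeError("running total never reached half of the genes")


# Toy example (not data from this study): 26 genes in six orthogroups.
toy_g50, toy_o50 = g50_o50([4, 10, 1, 6, 3, 2])
print(toy_g50, toy_o50)
```

In the toy example, the two largest orthogroups (10 and 6 genes) together hold 16 of 26 genes, so half of the genes are reached at the second orthogroup, whose size is 6.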
A total of 947 single-copy genes with 364,262 reliable sites were retained
for phylogenetic inference. The topology is congruent with previously
inferred phylogenetic relationships of Lepidoptera, in which no
representative of the Copromorphoidea was included (Wan et al., 2019).
Current molecular phylogenetic studies have not resolved the
phylogenetic relationship between Copromorphoidea and Papilionoidea
(Mitter, Davis, & Cummings, 2017). Our result supports the notion that
PFM (Copromorphoidea) is sister to the
butterfly D. plexippus (Papilionoidea), rather than a sister-group
relationship between Copromorphoidea or Papilionoidea and Pyraloidea + (Noctuoidea
+ Bombycoidea) (Fig. 3a).
We investigated orthogroups shared by PFM and four species of
Lepidoptera representing different clades of the phylogenetic tree of
Lepidoptera (Fig. 3b). There were 7,827 orthogroups (60.5% of 12,938
orthogroups) shared by all five lepidopteran species and 1,549
orthogroups shared by the four species other than C. pomonella. We
identified 357 orthogroups specific to PFM, fewer than in B.
mori (406) but more than in the other three lepidopteran species (Fig. 3b).
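The shared and species-specific orthogroup counts in this comparison are set operations on per-species orthogroup membership. The sketch below illustrates the three kinds of count reported above on toy data. The function names, the species labels `sp_B` and `sp_C`, and the orthogroup identifiers are hypothetical and are not taken from the OrthoFinder output of this study.

```python
from typing import Dict, Set

Membership = Dict[str, Set[str]]  # species -> orthogroup IDs with >= 1 gene


def shared_by_all(membership: Membership) -> Set[str]:
    """Orthogroups present in every species."""
    return set.intersection(*membership.values())


def specific_to(membership: Membership, species: str) -> Set[str]:
    """Orthogroups present in `species` and in no other species."""
    others = set().union(
        *(groups for name, groups in membership.items() if name != species)
    )
    return membership[species] - others


def shared_by_all_except(membership: Membership, excluded: str) -> Set[str]:
    """Orthogroups present in every species except `excluded`."""
    rest = [groups for name, groups in membership.items() if name != excluded]
    return set.intersection(*rest) - membership[excluded]


# Toy membership table (illustrative only, not data from this study).
toy: Membership = {
    "PFM": {"OG1", "OG2", "OG3", "OG6", "OG7"},
    "sp_B": {"OG1", "OG2", "OG4", "OG7"},
    "sp_C": {"OG1", "OG2", "OG5"},
}

print(sorted(shared_by_all(toy)))
print(sorted(specific_to(toy, "PFM")))
print(sorted(shared_by_all_except(toy, "sp_C")))
```

Applied to the five-species membership table, the sizes of these sets would correspond to the counts of orthogroups shared by all species, specific to one species, and shared by all species but one.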