3.1 | Genome assembly, annotation and evaluation
A total of 29 Gb (110 X) of PacBio long reads were generated (Table S1). The genome size was estimated to be 266.8 Mb based on K-mer depth distribution analysis (Table S2 and Fig. S1), and the assembled B. schroederi genome reached 262 Mb, accounting for 98.48% of the estimated genome size. The assembly contained 106 scaffolds, 21 of which were superscaffolds ligated using 105 Gb (~386 X) of Hi-C sequencing data (Fig. 1a; Fig. S2). The total length of these 21 superscaffolds reached ~263 Mb, accounting for 98.67% of the whole genome (Table S3). The final scaffold N50 was 12.69 Mb (Table 1), substantially higher than that of the previously published genome (Y. Hu et al., 2020). The GC-depth distribution (Fig. S3a and S3b) further showed that most genomic regions have a GC content narrowly centered around 37%, similar to that of three other roundworms (Ascaris suum, Parascaris univalens and Toxocara canis; Table 1) (“Ascaris suum draft genome,” 2011; Zhu et al., 2015). Benchmarking Universal Single-Copy Orthologs (BUSCO) scores against the nematode and eukaryote databases were 91.9% and 93.4%, respectively (Fig. S4), reflecting the highest genome completeness among published roundworm genomes (Table 1).
A total of 19,291 protein-coding genes were predicted via ab initio, homology-based and RNA sequencing-aided methods (see Methods) (“serine-threonine kinase KIN-29 modulates TGFbeta signaling and regulates body size formation,”). The predicted genes were functionally annotated against the Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups of proteins (COG), TrEMBL, Gene Ontology (GO), Swiss-Prot and InterPro databases (Fig. S5 and Fig. S6). The average length of coding sequences (CDS) was 1,052 bp, with an average of 6.87 exons per gene, similar to that of other related roundworms (Table S4). Furthermore, a 14,767 bp mitochondrial genome was identified, containing 12 protein-coding genes, 2 rRNA genes and 22 tRNA genes (see Methods; Table S5 and Fig. S7).
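For readers unfamiliar with the assembly metrics quoted above, the following minimal Python sketch shows how a scaffold N50 and a sequencing depth are conventionally computed. It is an illustration only, not the pipeline used in this study: the function names and the toy scaffold lengths are hypothetical, and the depth calculation uses the rounded figures from the text (29 Gb of reads, 266.8 Mb estimated genome size).

```python
def scaffold_n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def sequencing_depth(total_bases, genome_size):
    """Average depth (X) = sequenced bases / genome size."""
    return total_bases / genome_size


# Hypothetical toy assembly (arbitrary units), not data from this study.
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = scaffold_n50(toy_lengths)

# Rounded values quoted in the text: 29 Gb of reads, 266.8 Mb genome estimate.
pacbio_depth = sequencing_depth(29e9, 266.8e6)
```

With the rounded inputs, `pacbio_depth` comes out at roughly 109, in line with the reported ~110 X; small differences are expected because the text gives rounded read totals.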
To evaluate the completeness of the predicted protein-coding genes, we compared the length distributions of mRNA, CDS, exons and introns in B. schroederi with those of five other nematodes (Fig. S8 and Fig. S9).
Total repeats (DNA transposons and RNA transposons) accounted for 11.76% of the genome (Tables S6-S8 and Fig. S10). Repeat content varies widely among published nematode genomes, from 1% to 48% (Berriman et al., 2009; International Helminth Genomes Consortium, 2018; Schiffer, Kroiher, Kraus, Koutsovoulos, & Schierenberg, 2013). Transposable elements (TEs) account for 9.51% of the B. schroederi genome (Table S8), whereas TEs constitute 4.4% and 13.5% of the genomes of A. suum (“Ascaris suum draft genome,” 2011) and T. canis (Zhu et al., 2015), respectively. We identified a significant expansion of DNA transposons in B. schroederi compared with T. canis (Zhu et al., 2015) and A. suum (“Ascaris suum draft genome,” 2011) (Supplementary Data 1). At least 64 DNA transposon families were present, of which CMC-EnSpm, DNA and MULE-MuDR were the most abundant. We also identified 17 long terminal repeat (LTR) retrotransposon families and 41 non-LTR retrotransposon families (25 LINE and 16 SINE). Pao and Gypsy are the predominant LTR families, and CR1, RTE-RTE and L2 are the predominant non-LTR families. The number and size of the retrotransposon families were similar to those of other related roundworms (“Ascaris suum draft genome,” 2011; Ghedin et al., 2007; Zhu et al., 2015).
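Genome-wide repeat percentages such as those above are usually reported on a non-redundant basis, so that bases covered by overlapping annotations are counted once. The sketch below illustrates that calculation under stated assumptions: repeat annotations are given as half-open `(start, end)` intervals on a single sequence, and the function names and toy coordinates are hypothetical rather than taken from this study's annotation files.

```python
def merged_length(intervals):
    """Total number of bases covered by half-open (start, end) intervals,
    counting overlapping regions only once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def repeat_percentage(intervals, sequence_length):
    """Percentage of a sequence covered by at least one repeat annotation."""
    return 100.0 * merged_length(intervals) / sequence_length


# Hypothetical toy annotations on a 1,000 bp sequence.
toy_repeats = [(0, 100), (50, 150), (200, 250)]
toy_percent = repeat_percentage(toy_repeats, 1000)
```

In the toy case the first two annotations overlap, so 200 bases rather than 250 are counted, giving 20% repeat content. For a multi-scaffold assembly the same merge would be applied per scaffold and the covered lengths summed before dividing by the total assembly length.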