Genome Repeat Analysis and Gene Prediction
Overall, repetitive sequences accounted for 66.32% of the G. przewalskii genome. LTR elements were the most abundant repeat class, constituting 53.22% of the genome (Table S6). Gene structures were then obtained by gene prediction (Table 2). In total, 27,224 protein-coding genes were identified in the G. przewalskii genome, with an average gene length of 19,299.98 bp and an average of 10.13 exons per gene (Table S7). The completeness of the annotation was evaluated with BUSCO (v3.0.1) (Simão et al. 2015). The BUSCO analysis indicated that the annotation recovered 88% of BUSCOs as complete. The distribution of genes and repeats is shown in a Circos plot (Figure 2).
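As an illustration only, and not the pipeline used in this study, summary statistics of the kind reported in Table S7 (gene count, mean gene length, mean exons per gene) could be derived from a GFF3 annotation roughly as sketched below. The function name `summarize_genes`, the feature types `gene`, `mRNA`, and `exon`, and the assumption of one transcript per gene are all hypothetical choices made for this sketch.

```python
from collections import defaultdict


def parse_attrs(field):
    """Parse a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def summarize_genes(gff_lines):
    """Return (gene_count, mean_gene_length_bp, mean_exons_per_gene).

    Assumes GFF3 input in which each gene has a single mRNA child
    and exons point to that mRNA via their Parent attribute.
    """
    gene_len = {}                    # gene ID -> length in bp
    tx_to_gene = {}                  # transcript ID -> gene ID
    exons_per_tx = defaultdict(int)  # transcript ID -> exon count

    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed records
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = parse_attrs(cols[8])
        if ftype == "gene":
            # GFF3 coordinates are 1-based and inclusive
            gene_len[attrs["ID"]] = end - start + 1
        elif ftype == "mRNA":
            tx_to_gene[attrs["ID"]] = attrs["Parent"]
        elif ftype == "exon":
            exons_per_tx[attrs["Parent"]] += 1

    # Roll exon counts up from transcripts to their parent genes
    exons_per_gene = defaultdict(int)
    for tx, n_exons in exons_per_tx.items():
        gene_id = tx_to_gene.get(tx)
        if gene_id is not None:
            exons_per_gene[gene_id] += n_exons

    n_genes = len(gene_len)
    if n_genes == 0:
        return 0, 0.0, 0.0
    mean_len = sum(gene_len.values()) / n_genes
    mean_exons = sum(exons_per_gene.values()) / n_genes
    return n_genes, mean_len, mean_exons
```

The function accepts any iterable of lines, so an open file handle can be passed directly. For annotations with alternative transcripts, exons would need to be counted per representative transcript instead, which this sketch does not attempt.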