Repeat analysis
Genome sequences contain several types of repeats, so different methods were used to identify each type. First, simple sequence repeats (SSRs) were identified using the MIcroSAtellite Identification Tool (MISA) (Beier, et al. 2017), which detects and locates both simple and compound microsatellites. Next, a combination of de novo and homology-based strategies was used to search for other repeat sequences. RepeatModeler (v1.0.8) was used for de novo repeat detection, and the repeat sequences it found were classified with TEclass (Abrusan, et al. 2009). The classified sequences were merged with Repbase sequences (Jurka, et al. 2005) to construct a custom TE library. Finally, repeat sequences in the G. przewalskii genome were annotated with RepeatMasker (http://repeatmasker.org) using this custom TE library.
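To illustrate what the SSR identification step looks for, the following is a minimal Python sketch of perfect microsatellite detection. It is not MISA's implementation, and it does not handle compound microsatellites. The function name `find_ssrs` and the thresholds in `MIN_REPEATS` (motif length mapped to minimum number of tandem copies) are illustrative assumptions, not the settings used in this study.

```python
import re

# Assumed thresholds for illustration only: motif length -> minimum tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, copies) tuples for perfect SSRs.

    Coordinates are 0-based and half-open. Hypothetical helper, not part of MISA.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); those runs belong to the shorter motif class.
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            copies = (m.end() - m.start()) // unit_len
            hits.append((m.start(), m.end(), motif, copies))
    return sorted(hits)


if __name__ == "__main__":
    example = "CC" + "AT" * 7 + "TT" + "CAG" * 5 + "TT"
    for start, end, motif, copies in find_ssrs(example):
        print(f"{motif} x{copies} at [{start}, {end})")
```

The regular expression reports the leftmost match, so the reported motif depends on where the repeat run begins (for example, a run may be reported as GCA rather than CAG if the preceding base is G). A production tool also merges nearby SSRs into compound microsatellites, which this sketch omits.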