Abstract: Sophora tonkinensis (shandougen) is a woody leguminous plant widely known in China for its medicinal value. Genomes of various legumes, used as reference genetic maps for pseudomolecule assembly, have been published. However, the genome of Sophora has not been mapped. In this study, we report a chromosome-scale draft genome of S. tonkinensis assembled using PacBio single-molecule real-time sequencing reads and the Hi-C technique. We obtained a high-quality draft S. tonkinensis genome of 899 Mb, which is larger than the genomes of some other legumes, and BUSCO analysis revealed 95.9% completeness of the genome. We annotated 78.3% of the genome as repeat elements, with transposable elements (TEs) accounting for 73%. A total of 36,410 protein-coding genes were identified in the S. tonkinensis genome. Comparative analysis of genome size and repetitive sequences in S. tonkinensis and four other legumes (Lupinus albus, Lupinus angustifolius, Glycyrrhiza uralensis and Medicago truncatula) revealed that the TEs in S. tonkinensis were inserted after the whole-genome duplication and after its divergence from the other legumes. This suggests that the size of the S. tonkinensis genome may be related to repetitive sequence insertion. We also analyzed matrine and flavonoids, which are important compounds in S. tonkinensis. We further analyzed lignin and nitrogen-fixing genes, which play an important role in the adaptation of S. tonkinensis to the environment. In conclusion, the high-quality genome of S. tonkinensis obtained in this study lays a foundation for genetic and molecular biology studies of legumes.