http://doi.org/10.1016/j.foodchem.2019.125819
Parra, G., Bradnam, K., & Korf, I. (2007). CEGMA: a pipeline to
accurately annotate core genes in eukaryotic
genomes. Bioinformatics, 23(9), 1061–1067.
Price, A. L., Jones, N. C., & Pevzner, P. A. (2005). De novo
identification of repeat families in large genomes. Bioinformatics, 21(suppl_1), i351–i358.
https://doi.org/10.1093/bioinformatics/bti1018
Pruitt, K. D., Tatusova, T., & Maglott, D. R. (2007). NCBI reference
sequences (RefSeq): a curated non-redundant sequence database of
genomes, transcripts and proteins. Nucleic Acids
Research, 35(suppl_1), D61–D65.