Selecting the number of topics in an LDA analysis is an iterative process, and there is no formula for predicting the “best” number of topics. With too few topics, each one will be too general; with too many, they will be too specific. In other projects, we have used the pyLDAvis tool shown in Figure 2 to evaluate how well a topic model covers the topic space by looking at how much overlap there is between topics. As shown in Figure 2, there are some clusters, but only a couple of topics overlap significantly (5 and 15, which we labeled “Third-party restrictions” and “Genetics Databases”). Based on our experience with other projects, this is a good starting point for a topic model.
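The overlap check described above can also be approximated numerically. The following is a minimal sketch, not the procedure we used: it assumes each topic is available as a word-probability distribution and flags topic pairs whose Jensen-Shannon divergence falls below a cutoff. The function names, the toy distributions, and the 0.1 threshold are all hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence, skipping zero-probability terms.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def overlapping_topics(topic_word, threshold=0.1):
    """Return (i, j, divergence) for topic pairs closer than the threshold.

    The threshold is an assumed, illustrative cutoff, not a recommended value.
    """
    pairs = []
    for i, j in combinations(range(len(topic_word)), 2):
        d = js_divergence(topic_word[i], topic_word[j])
        if d < threshold:
            pairs.append((i, j, d))
    return pairs


# Toy topic-word distributions (hypothetical data; each row sums to 1).
topics = [
    [0.70, 0.20, 0.05, 0.05],
    [0.65, 0.25, 0.05, 0.05],
    [0.05, 0.05, 0.20, 0.70],
]

close = overlapping_topics(topics)
for i, j, d in close:
    print(f"Topics {i} and {j} overlap (JS divergence {d:.4f})")
```

Rerunning a check like this for several candidate topic counts gives a rough, quantitative complement to the visual inspection: many flagged pairs suggest the model has more topics than the corpus supports.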