The visualization of the topics estimated by LDA (Figure \ref{154862}) shows that the 20 topics initially chosen are reasonably spread out. As described in the original paper on LDAvis \cite{shirley2014}: ``In this view, we plot the topics as circles in the two-dimensional plane whose centers are determined by computing the distance between topics, and then by using multidimensional scaling to project the intertopic distances onto two dimensions, as is done in (Chuang et al., 2012a). We encode each topic’s overall prevalence using the areas of the circles, where we sort the topics in decreasing order of prevalence.'' The parameter $\lambda$ is the weight in the relevance metric; it can be adjusted between 0 and 1 to alter the ranking of terms and so aid topic interpretation. The right panel ranks keyword frequencies for the complete topic model, and hovering over an individual topic shows how that topic compares to the complete model.
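
For reference, the relevance of term $w$ to topic $k$, as defined in the LDAvis paper \cite{shirley2014}, can be written as follows. Here $\phi_{kw}$ denotes the probability of term $w$ under topic $k$ and $p_w$ its marginal probability in the corpus; this notation follows that paper and is not otherwise used in this text.
\begin{equation}
r(w, k \mid \lambda) = \lambda \log \phi_{kw} + (1 - \lambda) \log \frac{\phi_{kw}}{p_w}, \qquad 0 \le \lambda \le 1 .
\end{equation}
Setting $\lambda = 1$ ranks terms by their probability within the topic, whereas $\lambda = 0$ ranks them by lift, which favors terms that are rare in the corpus as a whole.
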
Selecting the number of topics in an LDA analysis is an iterative process, and no formula predicts the “best” number of topics. With too few topics, they will be too general; with too many, they will be too specific. In other projects, we have used the pyLDAvis tool shown in Figure 2 to evaluate how well a topic model covers the topic space by looking at how much overlap there is between topics. As shown in Figure 2, there are some clusters, but only two topics overlap significantly (5 and 15, which we labeled “Third-party restrictions” and “Genetics Databases”). Based on our experience with other projects, this is a good starting point for a topic model.
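
The overlap between two topics can also be quantified rather than judged by eye. As a sketch, assume the intertopic distance is the Jensen-Shannon divergence; this is an assumption about the tool's configuration, not something reported above. The distance between topics $i$ and $j$, with term distributions $\phi_i$ and $\phi_j$, is then
\begin{equation}
\mathrm{JSD}(\phi_i \,\|\, \phi_j) = \frac{1}{2} \sum_w \phi_{iw} \log \frac{\phi_{iw}}{m_w} + \frac{1}{2} \sum_w \phi_{jw} \log \frac{\phi_{jw}}{m_w}, \qquad m_w = \frac{1}{2} \left( \phi_{iw} + \phi_{jw} \right) .
\end{equation}
This quantity is 0 when the two topics have identical term distributions and reaches its maximum of $\log 2$ when they share no terms. Pairs of topics with small values therefore tend to appear close together or overlapping in the plot, although the projection onto two dimensions does not preserve distances exactly.
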