Rahul Nair

and 5 more

Background: Clinical trial simulations and pharmacometric modeling of biomarker profiles for under-represented groups are challenging because the underlying studies frequently do not have sufficient participants from these groups. Objectives: To investigate generative adversarial networks (GANs), an artificial intelligence (AI) technology that enables realistic simulations of complex patterns, for modeling clinical biomarker profiles of under-represented groups. Methods: GANs consist of generator and discriminator neural networks that operate in tandem. GAN architectures were developed for modeling univariate and joint distributions of a panel of 16 diabetes-relevant biomarkers from the National Health and Nutrition Examination Survey (NHANES), which contains laboratory and clinical biomarker data from a population-based sample of individuals of all ages, racial groups, and ethnicities. Conditional GANs were used to model biomarker profiles for race/ethnicity categories. GAN performance was assessed by comparing GAN outputs to test data. Results: The biomarkers exhibited non-normal distributions and varied in their bivariate correlation patterns. Univariate distributions were modeled with generator and discriminator neural networks consisting of two dense layers with rectified linear unit (ReLU) activation. The distributions of GAN-generated biomarkers were similar to the test data distributions. The joint distributions of the biomarker panel in the GAN-generated data were dispersed and overlapped with the joint distribution of the test data as assessed by three multi-dimensional projection methods. Conditional GANs satisfactorily modeled the joint distribution of the biomarker panel in the Black, Hispanic, White, and “Other” race/ethnicity categories. Conclusions: GANs are a promising AI approach for generating virtual patient data with realistic biomarker distributions for under-represented race/ethnicity groups.
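To make the generator/discriminator pairing and the conditioning step concrete, the following is a minimal forward-pass sketch of a conditional GAN in plain Python. It is an illustration under stated assumptions, not the architecture used in the study: the noise dimension, layer width, linear output layer, sigmoid discriminator output, one-hot encoding of the category, and all function names are hypothetical. The weights are random and untrained, and no NHANES data are used. Only the 16-biomarker panel size, the four category labels, and the use of dense layers with ReLU activation come from the abstract.

```python
import math
import random

CATEGORIES = ["Black", "Hispanic", "White", "Other"]  # labels named in the abstract
N_BIOMARKERS = 16  # panel size from the abstract
NOISE_DIM = 8      # assumed latent dimension (not stated in the abstract)
HIDDEN = 16        # assumed dense-layer width (not stated in the abstract)


def init_dense(n_in, n_out, rng):
    """Create (weights, biases) for one dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, relu=False):
    """Apply one dense layer, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def make_network(n_in, n_out, rng):
    """Two dense ReLU layers plus an assumed linear output layer."""
    return [init_dense(n_in, HIDDEN, rng),
            init_dense(HIDDEN, HIDDEN, rng),
            init_dense(HIDDEN, n_out, rng)]


def forward(net, x):
    h = dense(x, net[0], relu=True)
    h = dense(h, net[1], relu=True)
    return dense(h, net[2])


def one_hot(category):
    """Encode the race/ethnicity category as the conditioning vector."""
    return [1.0 if c == category else 0.0 for c in CATEGORIES]


def generate(gen, category, rng):
    """Map Gaussian noise plus the condition to a synthetic biomarker profile."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
    return forward(gen, noise + one_hot(category))


def discriminate(disc, profile, category):
    """Return the discriminator's probability that the profile is real."""
    logit = forward(disc, profile + one_hot(category))[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
generator = make_network(NOISE_DIM + len(CATEGORIES), N_BIOMARKERS, rng)
discriminator = make_network(N_BIOMARKERS + len(CATEGORIES), 1, rng)

fake_profile = generate(generator, "Hispanic", rng)
p_real = discriminate(discriminator, fake_profile, "Hispanic")

# Non-saturating generator loss: small when the discriminator is fooled.
generator_loss = -math.log(p_real)
```

In training, the discriminator would be updated to separate real from generated profiles while the generator is updated to lower `generator_loss`. Passing the same one-hot category to both networks is what lets a single trained model produce profiles for a chosen group.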