Barron JJ

and 8 more

The Healthcare Integrated Research Database (HIRD) is a real-world data source for health-related research. This article describes the data elements of the HIRD (including their sourcing, timeliness, and validity), the demographic and healthcare-related characteristics of individuals in the HIRD, and aspects of the HIRD relevant to real-world evidence generation. The HIRD includes health insurance claims and other health-related information for individuals enrolled in health insurance plans offered or managed by Elevance Health, and it has been used for research for almost two decades. Individuals in the HIRD reside throughout the United States. Data in the HIRD are available from January 1, 2006, onward and are updated monthly. As of July 31, 2024, the researchable population of the HIRD included over 91 million individuals with medical benefits, of whom over 24 million were actively enrolled. The median age of individuals in the HIRD is 36 years (interquartile range [IQR]: 22, 54), and 50% are female. The median duration of continuous enrollment in the HIRD is 2.0 years (IQR: 0.8, 4.7); for those actively enrolled, it is 3.8 years (IQR: 1.7, 8.3). Other important characteristics of the HIRD include the ability to trace data back to their source, to support both deterministic and probabilistic linkage with external data sources, and to link family members within health plans. The HIRD has been a trusted resource for generating real-world evidence across a variety of health-related research, including regulatory-required safety studies, comparative effectiveness studies, and health outcomes and economics research.

Aziza Jamal-Allial

and 8 more

Purpose: Ascertainment of mortality is critical to epidemiologic studies. Studies based on secondary databases face challenges because they rely on mortality data being recorded in health claims or electronic medical records. The National Death Index (NDI) is the gold standard for mortality data in the U.S. Methods: This study is a secondary analysis of an advanced cancer cohort in the U.S. between January 2010 and December 2018, with an established NDI linkage. The mortality data sources (inpatient discharge, disenrollment, the Death Master File (DMF), Centers for Medicare & Medicaid Services (CMS) data, utilization management (UM) data, and online obituary data) were compared with the NDI. We calculated sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and 95% confidence intervals (95% CI). For each source, a death date falling within the window from 60 days before to 30 days after the NDI death date was deemed a match. Results: Among 40,692 patients, 25,761 (63.3%) had a death date in the NDI. The composite algorithm had a sensitivity of 88.9% (95% CI: 88.5%, 89.3%), a specificity of 89.1% (95% CI: 88.6%, 89.6%), a PPV of 93.4% (95% CI: 93.1%, 93.7%), and an NPV of 82.3% (95% CI: 81.7%, 82.9%). When the sources were compared individually, each had a high PPV but limited sensitivity. Conclusion: The composite algorithm was a sensitive and precise measure of mortality in advanced cancer patients in the U.S. from 2010 to 2018, while individual database sources were accurate but had limited sensitivity compared with the NDI.
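The validation metrics named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`is_match`, `wald_ci`, `validity_metrics`) are hypothetical, the counts in the usage note are invented for illustration and are not the study's data, and the Wald (normal-approximation) interval is an assumption, since the abstract does not state which confidence interval method was used.

```python
from datetime import date, timedelta
from math import sqrt


def is_match(source_death, ndi_death, days_before=60, days_after=30):
    """Return True if the source death date lies within the matching window
    around the NDI death date (60 days before to 30 days after by default)."""
    earliest = ndi_death - timedelta(days=days_before)
    latest = ndi_death + timedelta(days=days_after)
    return earliest <= source_death <= latest


def wald_ci(successes, total, z=1.96):
    """Normal-approximation 95% CI for a proportion (assumed method)."""
    p = successes / total
    half_width = z * sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def validity_metrics(tp, fp, fn, tn):
    """Compute sensitivity, specificity, PPV, and NPV from a 2x2 table,
    treating the NDI as the reference standard.

    tp: death in source and in NDI (dates matched)
    fp: death in source, none in NDI
    fn: no death in source, death in NDI
    tn: no death in either
    """
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: {"estimate": num / den, "ci95": wald_ci(num, den)}
        for name, (num, den) in pairs.items()
    }
```

As a usage example with made-up counts, `validity_metrics(tp=80, fp=10, fn=20, tn=90)` gives a sensitivity of 0.80 and a specificity of 0.90, and `is_match(date(2015, 1, 1), date(2015, 2, 15))` is true because the source date falls 45 days before the NDI date, inside the 60-day lower bound.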