Results
A total of 24,127 preterm infants from 27 neonatal intensive care units
in the US were eligible for inclusion in the cohort. These infants were
born between February 2002 and March 2023. Of the cohort, 15,457
(64.1%) infants failed CPAP. Among the infants
who failed CPAP, 2.0% were American Indian or Alaska Native, 2.5% Asian,
15.8% Black or African American, 0.2% Native Hawaiian or Other Pacific
Islander, 63.6% White or Caucasian, 2.2% of multiple races, and
13.7% of other or unknown race. By ethnicity, 30.6% of infants were
of Hispanic or Latino origin and 14.4% were of unknown ethnicity. Of
those in the cohort, 47.3% were assigned female at birth and 52.7%
male; 50.4% were on governmental insurance plans, 28.6% were on
commercial or private insurance, 1.8% were self-pay, and 19.2% had
other or unknown insurance. Table 1 presents summary statistics for
the entire cohort.
Splitting the data into a training (75%) and a test (25%) set resulted
in 18,095 patients in the training set and 6,032 in the test set.
Table 2 presents bivariate statistics by CPAP failure status for the
data used to train the model. Hyperparameter tuning
resulted in the selection of an XGBoost model with 64 trees, a maximum
tree depth of 6, and a learning rate of 0.1. Feature importance is
shown in Figure 1; the strongest predictors of CPAP failure were
FiO2, systolic blood pressure, body temperature, PaO2, birthweight,
oxygen saturation, gestational age, and heart rate.
The SHAP values indicate that the risk of CPAP failure increased as
FiO2 increased up to 32%, and this elevated risk was sustained at
higher values of FiO2. Similarly, the SHAP values indicate an increase
in risk with fever or higher body temperature. Birthweights between
500 and 1000 grams were associated with a higher risk of failure than
birthweights between 1200 and 2000 grams. Most
patients had oxygen saturation between 60% and 100%, and the risk of
failure decreased as oxygen saturation increased. Similarly, the risk
of failure decreased with increasing gestational age. SHAP values for
heart rate indicate a non-linear relationship in which
lower-than-normal heart rates were associated with a higher risk of
failure. Values between 30 and 60 mmHg for pCO2 were associated with
a lower risk of failure than higher values. Higher respiratory rates
were associated with a reduced risk of failure. Infants assigned
male at birth were at higher risk of failing CPAP. The SHAP values of
the top 12 features (excluding insurance/payer) are shown in Figure 2.
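To make the direction of these effects concrete, the sketch below shows
how SHAP values combine for a single infant. It assumes the SHAP values
were computed on the model's log-odds output, which is the usual
default for tree-based classifiers; in that case the model output
equals a base value plus the sum of the per-feature SHAP values. Every
number below is invented for illustration and is not taken from the
study; the feature names simply mirror those discussed above.

```python
import math

def sigmoid(log_odds: float) -> float:
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical base value: the average model output, in log-odds.
base_value = 0.5

# Hypothetical per-feature SHAP values for one infant (log-odds units).
# Positive values push toward CPAP failure, negative values away from it.
shap_values = {
    "FiO2": 0.90,                # e.g. FiO2 above 32%
    "body_temperature": 0.10,
    "birthweight": 0.40,         # e.g. birthweight of 500-1000 g
    "oxygen_saturation": -0.30,  # e.g. high oxygen saturation
    "gestational_age": -0.20,
    "heart_rate": 0.05,
}

# Additivity: model output = base value + sum of SHAP values.
log_odds = base_value + sum(shap_values.values())
probability = sigmoid(log_odds)

# Rank features by the magnitude of their contribution for this infant.
ranked = sorted(shap_values, key=lambda name: abs(shap_values[name]), reverse=True)

print(f"predicted probability of CPAP failure: {probability:.2f}")
print("largest contributions:", ranked[:3])
```

Reading a SHAP plot amounts to the same bookkeeping: each point is one
infant's contribution for one feature, and its sign shows whether that
feature value raised or lowered the predicted risk.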
On the independent test set, the area under the receiver operating
characteristic curve was 0.91 (95% CI: 0.90, 0.92).
Choosing the classification threshold to balance sensitivity and
specificity gave a predicted probability threshold of 0.69, with
sensitivity of 0.86 (95% CI: 0.85, 0.87); specificity of 0.82 (95% CI:
0.80, 0.83); positive predictive value of 0.89 (95% CI: 0.88, 0.90);
negative predictive value of 0.76 (95% CI: 0.75, 0.78); and F1 score
of 0.87. In the test set, 10 of 11 predictions were true positive
predictions. The area under the precision-recall curve was 0.93. The
receiver operating characteristic and precision-recall curves are
shown in Figure 3.
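The reported metrics are linked by standard identities, which allows a
quick consistency check. The sketch below derives the positive and
negative predictive values and the F1 score from sensitivity,
specificity, and prevalence. It assumes the test-set prevalence of CPAP
failure equals the 64.1% observed in the full cohort, which the text
does not state; the function name is illustrative.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV, F1) implied by sensitivity, specificity, and prevalence."""
    # Expected fraction of all patients in each confusion-matrix cell.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)

    ppv = tp / (tp + fp)                  # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    f1 = 2.0 * tp / (2.0 * tp + fp + fn)  # harmonic mean of PPV and sensitivity
    return ppv, npv, f1

# Sensitivity and specificity as reported; prevalence is an assumption.
ppv, npv, f1 = predictive_values(sensitivity=0.86, specificity=0.82, prevalence=0.641)
print(f"PPV={ppv:.3f} NPV={npv:.3f} F1={f1:.3f}")
```

With these rounded inputs, the implied values fall within about 0.01
of the reported positive predictive value (0.89), negative predictive
value (0.76), and F1 score (0.87); small gaps are expected from
rounding and from the prevalence assumption.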