Discussion
Neonates who fail CPAP are at increased risk of pneumothorax, prolonged
respiratory support, bronchopulmonary dysplasia, and death 18. Identifying these neonates shortly after birth is
crucial for implementing targeted strategies to reduce such
complications.
In our study, we created a machine learning model to predict the risk of
CPAP failure in preterm neonates born at less than 32 weeks' gestation
and/or with a birth weight of less than 1500 grams. To our knowledge,
this is the first study to describe the use of machine learning to
predict CPAP failure using large multicenter data. CPAP failure was defined as
the need for invasive mechanical ventilation within the first 72 hours
of life.
The rate of CPAP failure in our cohort was 64.1%. This is higher than
previously reported rates of 21% to 51% 10,19–22.
The wide range of these rates is likely related to institutional
differences in initial respiratory management and to the inclusion of
older cohorts from periods when practices such as routine intubation of
infants of certain gestational ages were common 21. Because our cohort
comprised infants from 27 neonatal intensive care units, we could not
account for between-center variability in the criteria for invasive
ventilation or in surfactant administration practices (including
indication and mode of surfactant delivery).
The strongest predictor of CPAP failure in our model was FiO2, with the
highest risk of CPAP failure at an FiO2 of 0.31. This is
consistent with prior studies that reported FiO2 in the first few hours
of life as predictive of CPAP failure, with FiO2 cutoffs ranging from
0.25 to 0.3 10,19,20,23.
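A risk curve for a single predictor such as FiO2 can be read from a fitted model with a partial dependence calculation: the predictor is set to each value on a grid for every patient, and the predicted risks are averaged. The sketch below illustrates that idea only. The source does not state that this exact procedure was used, and `toy_predict`, its coefficients, its plateau at 0.30, and the two example rows are hypothetical stand-ins, not the study's model or data.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average predicted risk when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the feature in every row; other variables stay as observed.
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_predict(row):
    """Hypothetical stand-in for a fitted model (NOT the study's model)."""
    fio2_term = 4.0 * (min(row["fio2"], 0.30) - 0.21)  # rises, flat above 0.30
    ga_term = 0.02 * (32 - row["ga_weeks"])            # lower risk at higher GA
    return max(0.0, min(1.0, 0.2 + fio2_term + ga_term))


# Made-up rows for illustration only.
rows = [{"fio2": 0.21, "ga_weeks": 28}, {"fio2": 0.40, "ga_weeks": 30}]
grid = [0.21, 0.25, 0.30, 0.40]
curve = partial_dependence(toy_predict, rows, "fio2", grid)
```

With this toy model the averaged risk rises across the first three grid values and is unchanged between 0.30 and 0.40, which is the shape one would look for when locating a plateau in a real model.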
We found an increasing risk of CPAP failure with elevated body
temperature. Other variables with a high importance ranking were
systolic blood pressure and arterial PaO2. Several studies report
birth weight as a significant predictor of CPAP failure 10,23,24, and it was one of the top 5 variables that
contributed to our prediction model. Neonates with a birth weight of
500-1000 grams had a higher risk of failing CPAP compared to those with
a birth weight of 1200-2000 grams. As expected, we also found that the
risk of CPAP failure decreased with increasing gestational age.
Our study is unique in that it provides a machine learning model for
predicting CPAP failure using multicenter data. The benefit of machine
learning is that it accounts for complex relationships between variables
and does not assume that these predictors are independent of each other.
It can use large data sets, such as our cohort, to generate a model that
is validated as the algorithm is created 25. We tested
our model on a subset of our cohort and found that the area under the
receiver operating characteristic curve was 0.91 and the area under the
precision-recall curve was 0.93, indicating high model performance.
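For readers less familiar with these two metrics, the following is a minimal sketch of how each can be computed from predicted risks and observed outcomes. It is illustrative only: the source does not say which software or estimator was used, average precision is just one common estimator of the area under the precision-recall curve, and the labels and scores are made-up toy values rather than study data.

```python
def auroc(y_true, y_score):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Mean precision at the rank of each positive (assumes no tied scores)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive outcome is required")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos


# Toy example: 1 = CPAP failure, 0 = CPAP success (illustrative values only).
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.55, 0.2]
toy_auroc = auroc(y_true, y_score)
toy_ap = average_precision(y_true, y_score)
```

Because the precision-recall curve depends on how common the outcome is, its area is best interpreted against the event rate of the cohort, whereas the AUROC has a fixed chance level of 0.5.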
This study has several limitations, including the absence of a
standard definition of CPAP failure. Consequently, any multicenter
study may be limited by institutional differences in classifying
patients who failed CPAP and in the corresponding treatment recommendations.
The objective of this study was to develop the earliest possible warning
system for CPAP failure by using the very first set of clinical data as
predictors. While this is helpful, predictions made later in the course
of the NICU stay may be more accurate, at the cost of less time for
intervention. Future multicenter studies may shed more light on the
value of longitudinal follow-up until CPAP failure or discharge.
There were missing data, which we accounted for in the choice of machine
learning algorithm. However, although these algorithms can capture
relationships between input variables in the presence of missing data,
the accuracy of the model may have been affected. Because the data in
OERWD are de-identified, neonatal charts could not be linked with
maternal charts. Therefore, the model does not include
important perinatal factors, such as antenatal steroid exposure, mode
of delivery, and Apgar scores.
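One way a learning algorithm can accommodate missing values without imputation is to learn, at each decision split, which branch missing observations should follow. Several tree-based methods take this approach. The sketch below shows the idea for a single threshold rule. It is an assumption for illustration, since this section does not name the algorithm used, and the values and labels are made-up toy data.

```python
def fit_stump(values, labels):
    """Fit a one-variable threshold rule; None marks a missing value.

    Predicts 1 when value > threshold. Missing values follow a learned
    default branch. Returns (threshold, missing_goes_high, errors).
    """
    observed = sorted({v for v in values if v is not None})
    best = None
    for threshold in observed:
        for missing_goes_high in (False, True):
            errors = 0
            for v, y in zip(values, labels):
                high = missing_goes_high if v is None else v > threshold
                errors += int(int(high) != y)
            # Keep the first rule with strictly fewer misclassifications.
            if best is None or errors < best[2]:
                best = (threshold, missing_goes_high, errors)
    return best


# Toy data: a predictor with two missing entries; 1 = failure, 0 = success.
values = [0.21, 0.25, None, 0.35, 0.40, None, 0.23, 0.45]
labels = [0, 0, 1, 1, 1, 1, 0, 1]
rule = fit_stump(values, labels)
```

In this toy example both missing entries belong to the failure group, so the fitted rule sends missing values to the high-risk branch. When missingness is unrelated to the outcome, the default branch carries little information, which is one reason missing data can still reduce accuracy.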
In conclusion, we describe a machine learning model to predict CPAP
failure in preterm neonates with RDS. The most significant predictor in
our model was FiO2: we observed a direct relationship between FiO2
and CPAP failure up to a value of 0.32, after which the chances of
CPAP failure remained relatively stable (see Figure 2). Other predictors
of importance were gestational age, birth weight and first PCO2 value.
We tested the model’s ability to predict CPAP failure in a subset of our
cohort and found high performance. Implementation of this model into the
electronic medical record can facilitate early prediction of CPAP
failure and identify neonates who may benefit from targeted
interventions to maximize the success of non-invasive ventilation.