Discussion
Neonates who fail CPAP are at increased risk of pneumothorax, prolonged respiratory support, bronchopulmonary dysplasia, and death 18. Identifying these neonates shortly after birth is crucial for implementing targeted strategies to reduce such complications.
In our study, we created a machine learning model to predict the risk of CPAP failure in preterm neonates born at less than 32 weeks' gestation and/or with a birth weight of less than 1500 grams. To our knowledge, this is the first study describing the utility of machine learning in predicting CPAP failure using large multicenter data. CPAP failure was defined as the need for invasive mechanical ventilation within the first 72 hours of life.
The rate of CPAP failure in our cohort was 64.1%. This is higher than previously reported rates of 21% to 51% 10,19–22. The wide range of these rates is likely related to institutional differences in initial respiratory management and to the inclusion of older cohorts from a period when practices such as routine intubation of infants of certain gestational ages were common 21. Since our cohort comprised infants from 27 neonatal intensive care units, we could not account for variability between centers in the criteria for invasive ventilation or in surfactant administration practices (including indication and mode of surfactant delivery).
The strongest predictor of CPAP failure in our model was FiO2, with the highest risk of CPAP failure at an FiO2 of 0.31. This is consistent with prior studies that reported FiO2 in the first few hours of life as predictive of CPAP failure, with FiO2 cutoffs ranging from 0.25 to 0.3 10,19,20,23.
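One common way to examine how predicted risk changes with a single variable such as FiO2 is a partial dependence curve: the variable is set to each value on a grid for every patient in turn, and the model's predictions are averaged. The sketch below illustrates the idea only. It is not necessarily the method used in this study, and `toy_model`, its coefficients, the field names, and the two synthetic rows are hypothetical placeholders rather than our model or data.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def partial_dependence(
    model: Callable[[Row], float],
    rows: Sequence[Row],
    feature: str,
    grid: Sequence[float],
) -> List[float]:
    """Average predicted risk when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the input rows are not mutated
            modified[feature] = value  # override only the feature of interest
            total += model(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row: Row) -> float:
    """Hypothetical stand-in for a fitted classifier (NOT the study model).

    Risk rises with FiO2 up to an arbitrary plateau at 0.30 and is higher
    for lower birth weight; all coefficients are invented for illustration.
    """
    fio2_term = min(row["fio2"], 0.30) - 0.21
    weight_term = (1500.0 - row["birth_weight_g"]) / 1500.0
    risk = 0.2 + 4.0 * fio2_term + 0.3 * weight_term
    return max(0.0, min(1.0, risk))


# Synthetic rows, invented for illustration only.
rows = [
    {"fio2": 0.21, "birth_weight_g": 900.0},
    {"fio2": 0.40, "birth_weight_g": 1400.0},
]
grid = [0.21, 0.25, 0.30, 0.40]
curve = partial_dependence(toy_model, rows, "fio2", grid)
for value, risk in zip(grid, curve):
    print(f"FiO2={value:.2f}  mean predicted risk={risk:.3f}")
```

With a real fitted model in place of `toy_model`, a curve that rises and then flattens would correspond to a threshold-like relationship between FiO2 and predicted risk.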
We found an increasing risk of CPAP failure with elevated body temperature. Other variables with a high variable importance ranking were systolic blood pressure and arterial PaO2. Several studies report birth weight as a significant predictor of CPAP failure 10,23,24, and it was one of the top 5 variables contributing to our prediction model. Neonates with a birth weight of 500-1000 grams had a higher risk of failing CPAP than those with a birth weight of 1200-2000 grams. As expected, we also found that the risk of CPAP failure decreased with increasing gestational age.
Our study is unique in that it provides a machine learning model for predicting CPAP failure using multicenter data. The benefit of machine learning is that it accounts for complex relationships between variables and does not assume that these predictors are independent of each other. It can use large data sets, such as our cohort, to generate a model that is validated as the algorithm is created 25. We tested our model on a subset of our cohort and found that the area under the receiver operating characteristic curve was 0.91 and the area under the precision-recall curve was 0.93, indicating very high model performance.
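For readers less familiar with the two metrics reported above, the following minimal sketch shows how they can be computed from predicted risks and observed outcomes. It is an illustration under stated assumptions, not the code used in this study: the function names are our own, the area under the precision-recall curve is approximated by average precision (a step-wise summary), and the labels and scores are synthetic.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Tied scores count as half a win (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Sums precision at each distinct score threshold, weighted by the
    increase in recall at that threshold.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    pairs = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    tp = 0
    ap = 0.0
    i = 0
    while i < len(pairs):
        j = i
        tp_in_group = 0
        # Treat all cases sharing the same score as one threshold.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp_in_group += pairs[j][1]
            j += 1
        tp += tp_in_group
        precision = tp / j               # j cases are predicted positive here
        ap += precision * (tp_in_group / n_pos)
        i = j
    return ap


# Synthetic example: 1 = CPAP failure, 0 = CPAP success (invented values).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUROC = {auroc(labels, scores):.3f}")
print(f"Average precision = {average_precision(labels, scores):.3f}")
```

The chance-level value of the area under the precision-recall curve equals the proportion of positive cases in the data being scored, whereas chance for the area under the receiver operating characteristic curve is 0.5 regardless of prevalence. The two values are therefore best read against different baselines.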
This study has several limitations, including the absence of a standard definition of CPAP failure. Consequently, any multicenter study may be limited by institutional differences in how patients are classified as having failed CPAP and in the corresponding treatment recommendations.
The objective of this study was to develop an early warning system that predicts the risk of CPAP failure using the very first set of clinical data as predictors. While this is helpful, predictions made later in the course of the NICU stay may be more accurate but would leave less time for intervention. Future multicenter studies may shed more light on the value of longitudinal follow-up until CPAP failure or discharge.
There were missing data, which we accounted for in the choice of machine learning algorithm. However, although these algorithms can capture relationships between input variables in the presence of missing data, the accuracy of the model may have been affected. Because the data in OERWD are de-identified, neonatal charts could not be linked to maternal charts. Therefore, the model does not include important perinatal factors, such as antenatal steroid exposure, mode of delivery, and Apgar scores.
In conclusion, we describe a machine learning model to predict CPAP failure in preterm neonates with RDS. The most significant predictor in our model was FiO2: we observed a direct relationship between FiO2 and CPAP failure up to a value of 0.32, after which the chance of CPAP failure remained relatively stable (see Figure 2). Other important predictors were gestational age, birth weight, and first PCO2 value. We tested the model's ability to predict CPAP failure in a subset of our cohort and found high performance. Implementing this model in the electronic medical record could facilitate early prediction of CPAP failure and identify neonates who may benefit from targeted interventions to maximize the success of non-invasive ventilation.