Aziida Nanyonga

and 4 more

Safety is a critical aspect of the air transport system, given that even slight operational anomalies can result in serious consequences. To reduce the likelihood of aviation safety occurrences, accidents and incidents are reported so that root causes can be established and safety recommendations proposed. However, the narratives describing pre-accident events are written as raw, unstructured, human-readable text that a computer system cannot readily interpret. The ability to classify and categorise safety occurrences from their textual narratives would help aviation industry stakeholders make informed safety-critical decisions. To this end, we applied natural language processing (NLP) and artificial intelligence (AI) models to process text narratives. The study aimed to answer the question, "How well can the damage level caused to the aircraft in a safety occurrence be inferred from the text narrative using natural language processing?" We evaluated the classification performance of several deep learning models (LSTM, BLSTM, GRU, and sRNN) and of combinations of these models (LSTM+GRU, BLSTM+GRU, sRNN+LSTM, sRNN+BLSTM, sRNN+GRU, sRNN+BLSTM+GRU, and sRNN+LSTM+GRU) on a set of 27,000 safety occurrence reports from the NTSB. All models investigated performed competitively, recording an accuracy of over 87.9%, well above the 25% expected from random guessing in a four-class classification problem. The models also recorded precision, recall, and F1 scores above 80%, 88%, and 85%, respectively. Among the single models, sRNN slightly outperformed the others in recall (90%) and accuracy (90%), while LSTM reported slightly better precision (87%). Among the joint models, LSTM+GRU and sRNN+BLSTM+GRU recorded the best recall (90%) and accuracy (90%). These results suggest that the damage level can be inferred from the raw text narratives using NLP and deep learning models.
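As an illustration of the metrics quoted above, the following minimal sketch computes accuracy, precision, recall, and F1 for a four-class problem, together with the uniform random-guess baseline. It is not the authors' evaluation code. The class names in `CLASSES`, the function name `evaluate`, and the choice of macro averaging are assumptions made for this example; the abstract does not state how the per-class scores were averaged.

```python
from collections import Counter

# Hypothetical labels: the abstract only states that there are four classes.
CLASSES = ["none", "minor", "substantial", "destroyed"]


def evaluate(y_true, y_pred, classes=CLASSES):
    """Return accuracy, macro-averaged precision/recall/F1, and chance level."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in classes:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(classes)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "chance": 1 / n,  # uniform random guess: 0.25 for four classes
    }
```

Calling `evaluate(true_labels, predicted_labels)` on a held-out set returns a dictionary whose `chance` entry gives the baseline against which the reported accuracies are compared.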