Killian Murphy

and 2 more

Network Fault Prediction (NFP) has recently attracted new scientific interest. The ability to predict network equipment failure is increasingly recognized as an effective tool for improving network reliability. This predictive capability can then be used to mitigate impending network failures or to carry out preventive maintenance before they occur. This could enable the emergence of zero-failure networks and allow safety-critical applications to run over larger and more complex heterogeneous networks. In this paper, we present a comprehensive survey of Network Fault Prediction using Machine Learning (ML) methods, entirely dedicated to telecommunication networks. Specifically, we first introduce the key concepts of NFP. Second, we present the specific constraints faced by a Network and System Integrator (NSI), such as network heterogeneity, Quality of Service (QoS), maintenance costs, and guaranteed Time To Restore (TTR) according to equipment importance. As a consequence, we propose a new metric of ML performance that measures the expected reduction in maintenance cost achieved with an ML predictive model. Third, we survey the ML methods used in NFP studies and provide recommendations regarding their use in telecommunication networks. Fourth, we summarize the different types of predictions made for different NFP problems, such as Equipment Health, Network Health, Link Failure, or Alarm Prediction. Finally, we identify the need for zero-loss and zero-delay handover, and the necessity of additional prediction elements, as future research directions. In conclusion, we summarize the state of the art and the new findings of this survey.