Enhancing Outlier Detection in Air Quality Index Data Using a Stacked Machine Learning Model

Abdoul Aziz Diallo; Lawrence Nderu; Bonface Malenje; Gideon Kikuvi

doi:10.22541/au.169268979.99393086/v1

Abstract

Air quality is an important component of environmental health, with serious consequences for human health and well-being. The Air Quality Index (AQI) is a widely used metric for assessing air quality across locations and over time. However, AQI data, like many other types of environmental data, can contain outliers: data points that deviate significantly from other observations, indicating exceptionally good or poor air quality. Detecting these outliers is a critical step in identifying and understanding extreme pollution episodes that can have serious environmental and public health consequences. Outliers can be caused by a variety of factors, including measurement errors, unusual meteorological conditions, and pollution events. While outliers can occasionally give useful information about these unusual conditions, they can also skew analyses and models if they are not adequately accounted for. This paper describes a hybrid method for detecting outliers in data and evaluates it on AQI data. The method uses a stacked machine learning model that combines K-means clustering, Random Forest (RF), and a Gradient Boosting Classifier (GBC). K-means is used for initial categorization, followed by RF model training; the RF output is then used as input for the GBC to generate the final classification. The performance of the stacked model is evaluated and compared with that of single models using accuracy as the metric. The findings show that the proposed technique is effective, achieving an accuracy of 0.99 and showing its potential for outlier detection in data.
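
The three-stage pipeline described in the abstract (K-means for initial categorization, RF training, RF output passed to the GBC) could be sketched roughly as follows with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic one-feature data, the helper names `make_synthetic_aqi` and `fit_stacked_detector`, the choice of two clusters with the smaller one treated as the outlier class, and the use of RF class probabilities as the "RF output" are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def make_synthetic_aqi(n_normal=950, n_extreme=50, seed=0):
    """Hypothetical stand-in for AQI readings: one feature, two regimes."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(loc=80.0, scale=15.0, size=n_normal)
    extreme = rng.normal(loc=400.0, scale=20.0, size=n_extreme)
    return np.concatenate([normal, extreme]).reshape(-1, 1)


def fit_stacked_detector(X, seed=0):
    """Fit K-means -> RF -> GBC and return (held-out accuracy, K-means labels)."""
    # Stage 1: K-means provides the initial categorization.
    # Assumption: two clusters, with the smaller one labelled as outliers (1).
    km = KMeans(n_clusters=2, n_init=10, random_state=seed).fit(X)
    minority = np.argmin(np.bincount(km.labels_, minlength=2))
    y = (km.labels_ == minority).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Stage 2: Random Forest is trained on the K-means-derived labels.
    rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rf.fit(X_train, y_train)

    # Stage 3: the RF output (here: class probabilities, an assumption)
    # becomes the input features of the Gradient Boosting Classifier.
    gbc = GradientBoostingClassifier(random_state=seed)
    gbc.fit(rf.predict_proba(X_train), y_train)

    # Final classification on held-out data, scored with accuracy.
    y_pred = gbc.predict(rf.predict_proba(X_test))
    return accuracy_score(y_test, y_pred), y


if __name__ == "__main__":
    accuracy, labels = fit_stacked_detector(make_synthetic_aqi())
    print(f"outliers flagged by K-means: {int(labels.sum())}")
    print(f"stacked model accuracy: {accuracy:.3f}")
```

Note that in this sketch the GBC is trained on the RF's predictions for the same rows the RF was fitted on, which tends to be optimistic; out-of-fold RF predictions (for example via `sklearn.model_selection.cross_val_predict`) would be a more careful choice. The accuracy obtained here also measures agreement with the K-means labels on synthetic data and says nothing about the 0.99 figure reported for the real AQI data.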

15 Aug 2023 Submitted to Engineering Reports

22 Aug 2023 Submission Checks Completed

22 Aug 2023 Assigned to Editor

24 Aug 2023 Review(s) Completed, Editorial Evaluation Pending

11 Sep 2023 Reviewer(s) Assigned

30 Oct 2023 Editorial Decision: Revise Major

23 Feb 2024 Review(s) Completed, Editorial Evaluation Pending

03 Apr 2024 Editorial Decision: Revise Major

07 Apr 2024 2nd Revision Received

16 Apr 2024 Submission Checks Completed

16 Apr 2024 Assigned to Editor

16 Apr 2024 Review(s) Completed, Editorial Evaluation Pending

17 Apr 2024 Reviewer(s) Assigned

08 May 2024 3rd Revision Received

Peer review status: IN REVISION