Optimizing ROC Curve for Ensemble Models through Pareto Front Analysis
of the Decision Space
Abstract
The Receiver Operating Characteristic (ROC) curve is commonly used to
evaluate the performance of machine learning ensemble classification
models, which combine multiple classifiers and use a voting procedure to
determine the final classification. Although such ensembles have many
parameters, their ROC curves are usually obtained by sweeping only the
voting threshold, limiting their potential for improvement. In this paper
we propose a new method, ROC mapping, which improves model performance by
redefining the ROC curve as the Pareto front of a multi-objective
optimization problem that maps the multidimensional space of all
parameters of the ensemble classifier (the Decision space) into the
Objective space, defined on the two-dimensional unit interval. We use an
algorithm based on NSGA-II
to explore the Decision space and validate the proposal on two different
classification problems: (1) predicting car insurance claims on a highly
imbalanced dataset (Insurance dataset), and (2) predicting obesity risk
with a balanced clinical dataset (GenObIA dataset). We compare our
method with alternative ensemble optimization methods, using visual
assessment, the Area Under the Curve (AUC), and the Youden Index as
figures of merit. On the Insurance dataset, our method shows an average
improvement of 46.4% in AUC and 26.1% in the Youden Index, both
calculated relative to the maximum achievable improvement. On the GenObIA
dataset, we achieve an average increase of 29.7% in AUC and 11.9% in the
Youden Index, again relative to the maximum possible improvement. The ROC
mapping
approach provides a comprehensive and adaptable ROC curve, demonstrating
its effectiveness in improving classification performance across
different applications.
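
To make the mapping concrete, the following is a minimal sketch of the idea, not the paper's implementation: it assumes pymoo as the NSGA-II implementation, a synthetic dataset standing in for the Insurance and GenObIA data, and a hypothetical Decision space consisting of three soft-voting weights plus the voting threshold. The Objective space is (FPR, 1 - TPR), so the resulting Pareto front is read back as ROC points.

# Hedged sketch of ROC mapping: NSGA-II over an ensemble's decision space.
# Assumptions (not from the paper): a 3-member soft-voting ensemble, pymoo,
# synthetic data, and a decision vector of per-classifier weights + threshold.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from pymoo.core.problem import ElementwiseProblem
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize

# Toy data standing in for the real datasets.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3,
                                            stratify=y, random_state=0)

# Fixed base classifiers; only the combination rule is optimized here.
members = [LogisticRegression(max_iter=1000).fit(X_tr, y_tr),
           DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr),
           GaussianNB().fit(X_tr, y_tr)]
probas = np.column_stack([m.predict_proba(X_val)[:, 1] for m in members])

class ROCMapping(ElementwiseProblem):
    # Decision space: 3 voting weights + 1 threshold, all in [0, 1].
    # Objective space: (FPR, 1 - TPR), both minimized, so the Pareto front
    # corresponds to the upper-left envelope of the ROC plane.
    def __init__(self):
        super().__init__(n_var=4, n_obj=2, xl=np.zeros(4), xu=np.ones(4))

    def _evaluate(self, x, out, *args, **kwargs):
        w, thr = x[:3], x[3]
        w = w / w.sum() if w.sum() > 0 else np.full(3, 1 / 3)
        y_hat = (probas @ w >= thr).astype(int)
        tp = np.sum((y_hat == 1) & (y_val == 1))
        fp = np.sum((y_hat == 1) & (y_val == 0))
        fn = np.sum((y_hat == 0) & (y_val == 1))
        tn = np.sum((y_hat == 0) & (y_val == 0))
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        out["F"] = [fpr, 1.0 - tpr]

res = minimize(ROCMapping(), NSGA2(pop_size=60), ("n_gen", 40),
               seed=1, verbose=False)

# Re-express the non-dominated set as (FPR, TPR) points of the mapped ROC,
# anchored at (0, 0) and (1, 1) so the area is well defined.
F = np.atleast_2d(res.F)
pts = sorted((float(f[0]), 1.0 - float(f[1])) for f in F)
front = np.vstack([[0.0, 0.0], np.array(pts), [1.0, 1.0]])
auc = np.trapz(front[:, 1], front[:, 0])        # area under the mapped curve
youden = float(np.max(front[:, 1] - front[:, 0]))  # J = TPR - FPR
print(f"Pareto-front AUC ~ {auc:.3f}, best Youden index ~ {youden:.3f}")

Because every candidate solution found this way is compared on (FPR, 1 - TPR), each non-dominated point lies on or above the ROC curve obtained by sweeping the voting threshold alone, which is why a Pareto front over the full Decision space can only match or improve the single-threshold baseline.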