KDE-Based Ensemble Learning for Imbalanced Data

Date
2022-09
Authors
Kamalov, Firuz
Moussa, Sherif
Avante Reyes, Jorge
Publisher
MDPI
Abstract
Imbalanced class distribution affects many applications in machine learning, including medical diagnostics, text classification, and intrusion detection. In this paper, we propose a novel ensemble classification method designed to deal with imbalanced data. The proposed method trains each tree in the ensemble on its own uniquely generated, synthetically balanced dataset. The data balancing is carried out via kernel density estimation, which offers a natural and effective approach to generating new sample points. We show that the proposed method results in a lower variance of the model estimator. The proposed method is tested against benchmark classifiers on a range of simulated and real-life data. The experimental results show that the proposed classifier significantly outperforms the benchmark methods. © 2022 by the authors.
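The idea summarized in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: this record does not specify the kernel, the bandwidth rule, the tree learner, or the aggregation rule, so the Gaussian kernel with SciPy's default bandwidth, scikit-learn decision trees, binary 0/1 labels, and majority voting are all assumptions. The helper names `kde_oversample`, `fit_kde_ensemble`, and `predict_majority` are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.tree import DecisionTreeClassifier


def kde_oversample(X_min, n_new, seed):
    """Draw n_new synthetic points from a Gaussian KDE fitted to the minority class.

    Assumes the minority class has more samples than features, otherwise
    the KDE covariance is singular.
    """
    kde = gaussian_kde(X_min.T)  # SciPy expects shape (n_features, n_samples)
    return kde.resample(int(n_new), seed=seed).T


def fit_kde_ensemble(X, y, n_trees=25, seed=0):
    """Fit n_trees decision trees, each on its own KDE-balanced copy of (X, y)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = int(counts.max() - counts.min())  # points needed to balance the classes
    X_min = X[y == minority]
    trees = []
    for _ in range(n_trees):
        # A fresh seed per tree gives every tree a different synthetic sample.
        tree_seed = int(rng.integers(0, 2**31 - 1))
        X_syn = kde_oversample(X_min, n_new, tree_seed)
        X_bal = np.vstack([X, X_syn])
        y_bal = np.concatenate([y, np.full(n_new, minority)])
        tree = DecisionTreeClassifier(random_state=tree_seed)
        trees.append(tree.fit(X_bal, y_bal))
    return trees


def predict_majority(trees, X):
    """Majority vote over the trees (assumes binary labels 0 and 1)."""
    votes = np.stack([t.predict(X) for t in trees])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```

Because every tree sees a different synthetic minority sample, the trees are less correlated than trees trained on one shared oversampled dataset, which is the intuition behind the variance reduction the abstract mentions.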
Description
This article is licensed under a Creative Commons license, and the full text is openly accessible in the CUD Digital Repository. The version of record of this work is published in Electronics (Switzerland) (2022), available online at: https://doi.org/10.3390/electronics11172703
Keywords
data sampling, ensemble method, imbalanced data, kernel density estimate
Citation
Kamalov, F., Moussa, S., & Avante Reyes, J. (2022). KDE-based ensemble learning for imbalanced data. Electronics (Switzerland), 11(17). https://doi.org/10.3390/electronics11172703