Kernel density estimation-based sampling for neural network classification
Institute of Electrical and Electronics Engineers Inc.
Imbalanced data occurs in a wide range of scenarios. The skewed distribution of the target variable elicits bias in machine learning algorithms. One popular method to combat imbalanced data is to artificially balance the data through resampling. In this paper, we compare the efficacy of a recently proposed kernel density estimation (KDE) sampling technique in the context of artificial neural networks. We benchmark the KDE sampling method against two baseline sampling techniques and perform comparative experiments using 8 datasets and 3 neural network architectures. The results show that KDE sampling produces the best performance on 6 out of 8 datasets. However, it must be used with caution on image datasets. We conclude that KDE sampling is capable of significantly improving the performance of neural networks. © 2021 IEEE.
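The core idea described in the abstract can be illustrated with a minimal sketch: fit a Gaussian KDE to the minority class and draw synthetic points from it to balance the training set. Drawing from a Gaussian KDE is equivalent to picking a training point at random and adding kernel-bandwidth Gaussian noise. This is an assumption-laden illustration (the bandwidth rule, function name `kde_oversample`, and NumPy-only implementation are ours, not necessarily the paper's exact procedure):

```python
import numpy as np

def kde_oversample(X, n_new, rng=None):
    """Draw n_new synthetic points from a Gaussian KDE fitted to X.

    Illustrative sketch: sampling from a Gaussian KDE amounts to
    choosing a kernel centre uniformly from X and adding Gaussian
    noise whose scale is the kernel bandwidth.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Scott's rule bandwidth factor, scaled by per-feature std dev
    h = n ** (-1.0 / (d + 4)) * X.std(axis=0, ddof=1)
    idx = rng.integers(0, n, size=n_new)  # choose kernel centres
    return X[idx] + rng.normal(scale=h, size=(n_new, d))

# Hypothetical minority class with 40 samples and 2 features
rng = np.random.default_rng(42)
X_min = rng.normal(loc=[2.0, -1.0], scale=0.5, size=(40, 2))
X_syn = kde_oversample(X_min, n_new=60, rng=1)
print(X_syn.shape)  # (60, 2)
```

The synthetic points `X_syn` would then be appended to the minority class before training the network, so the classifier sees a balanced target distribution.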
This conference paper is not available at CUD collection. The version of scholarly record of this paper is published in 2021 International Symposium on Networks, Computers and Communications, ISNCC (2021), available online at: https://doi.org/10.1109/ISNCC52172.2021.9615715
Deep learning, Imbalanced data, KDE, Kernel density estimation, Neural networks, Sampling
Kamalov, F., & Elnagar, A. (2021). Kernel density estimation-based sampling for neural network classification. 2021 International Symposium on Networks, Computers and Communications, ISNCC. https://doi.org/10.1109/ISNCC52172.2021.9615715