Authors: Kamalov, Firuz; Elnagar, Ashraf; Leung, Ho Hon
Date: 2021-10-17
Year: 2021
Citation: Kamalov, F., Elnagar, A., & Leung, H. H. (2021). Ensemble Learning with Resampling for Imbalanced Data. In D.-S. Huang, K.-H. Jo, J. Li, V. Gribova, & A. Hussain (Eds.), Intelligent Computing Theories and Application. ICIC 2021. Lecture Notes in Computer Science, vol 12837 (pp. 564-578). Springer, Cham. https://doi.org/10.1007/978-3-030-84529-2_48
ISBN: 978-303084528-5
ISSN: 03029743
DOI: https://doi.org/10.1007/978-3-030-84529-2_48
Handle: http://hdl.handle.net/20.500.12519/452

Abstract: Imbalanced class distribution is an issue that appears in various applications. In this paper, we undertake a comprehensive study of the effects of sampling on the performance of bootstrap aggregating in the context of imbalanced data. Concretely, we carry out a comparison of sampling methods applied to single and ensemble classifiers. The experiments are conducted on simulated and real-life data using a range of sampling methods. The contributions of the paper are twofold: i) demonstrate the effectiveness of ensemble techniques based on resampled data over a single base classifier and ii) compare the effectiveness of different resampling techniques when used during the bagging stage for ensemble classifiers. The results reveal that ensemble methods overwhelmingly outperform single classifiers based on resampled data. In addition, we discover that NearMiss and random oversampling (ROS) are the optimal sampling algorithms for ensemble learning.

Language: en
License: License to reuse abstract has been provided by Springer Nature and Copyright Clearance Center.
Keywords: Data preprocessing sampling; Ensemble method; Imbalanced data; Oversampling; Undersampling
Title: Ensemble Learning with Resampling for Imbalanced Data
Type: Conference Paper
Copyright: © 2021, Springer Nature Switzerland AG.
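
The abstract describes applying a resampling method to each bootstrap sample during the bagging stage. The following is a minimal sketch of that general idea using random oversampling (ROS), not the authors' implementation. The k-nearest-neighbour base learner, the toy data generator, and all function names and parameter values here are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def random_oversample(X, y, rng):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


class KNN:
    """Stand-in base classifier (assumption): k-nearest-neighbour vote."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict_one(self, x):
        def dist2(i):
            return sum((a - b) ** 2 for a, b in zip(x, self.X[i]))
        nearest = sorted(range(len(self.X)), key=dist2)[: self.k]
        return Counter(self.y[i] for i in nearest).most_common(1)[0][0]


def fit_resampled_bagging(X, y, n_estimators=25, seed=0):
    """Train one base classifier per bootstrap sample, oversampling
    each bootstrap sample before fitting."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap draw
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # skip draws that lost a class entirely
        Xr, yr = random_oversample(Xb, yb, rng)
        models.append(KNN().fit(Xr, yr))
    return models


def predict(models, x):
    """Majority vote across the ensemble."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]


def make_toy_data(n_major=95, n_minor=5, seed=42):
    """Hypothetical imbalanced data: class 0 near (0, 0), class 1 near (3, 3)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(n_major)]
    X += [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(n_minor)]
    y = [0] * n_major + [1] * n_minor
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    models = fit_resampled_bagging(X, y)
    print(predict(models, [3.0, 3.0]), predict(models, [0.0, 0.0]))
```

Swapping `random_oversample` for another sampler (for example an undersampling routine) changes only the single call inside the bagging loop, which is the comparison the abstract refers to.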