Feature Selection in Imbalanced Data

dc.contributor.author: Kamalov, Firuz
dc.contributor.author: Thabtah, Fadi
dc.contributor.author: Leung, Ho Hon
dc.date.accessioned: 2022-02-16T15:29:29Z
dc.date.available: 2022-02-16T15:29:29Z
dc.date.copyright: © 2021
dc.date.issued: 2023-12
dc.description.abstract: The traditional feature selection methods are not suitable for imbalanced data as they tend to be biased towards the majority class. This problem is particularly acute in the field of medical diagnostics and fraud detection where the class distribution is highly skewed. In this paper, we propose a novel filter approach using decision tree-based F1-score. The F1-score incorporates the accuracy with respect to the minority class data and hence is a good measure in the case of imbalanced data. In the proposed implementation, the F1-score is calculated based on a 1-dimensional decision tree classifier resulting in a fast and effective feature evaluation method. Numerical experiments confirm that the proposed method achieves robust dimensionality reduction and accuracy results. In addition, the low computational complexity of the algorithm makes it a practical choice for big data applications. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature. [en_US]
dc.identifier.citation: Kamalov, F., Thabtah, F., & Leung, H. H. (2023). Feature selection in imbalanced data. Annals of Data Science, 10(6), 1527-1541. https://doi.org/10.1007/s40745-021-00366-5 [en_US]
dc.identifier.issn: 2198-5804
dc.identifier.uri: https://doi.org/10.1007/s40745-021-00366-5
dc.identifier.uri: http://hdl.handle.net/20.500.12519/514
dc.language.iso: en [en_US]
dc.publisher: Springer Science and Business Media Deutschland GmbH [en_US]
dc.relation: Authors' affiliations: Kamalov, F., Canadian University of Dubai, Dubai, United Arab Emirates; Thabtah, F., Manukau Institute of Technology, Manukau, New Zealand; Leung, H.H., UAE University, Al Ain, United Arab Emirates
dc.relation.ispartofseries: Annals of Data Science
dc.rights: License to reuse the abstract has been secured from Springer Nature and Copyright Clearance Center.
dc.rights.holder: Copyright: © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
dc.subject: Big data [en_US]
dc.subject: Data mining [en_US]
dc.subject: F1-score [en_US]
dc.subject: Feature selection [en_US]
dc.subject: Filter method [en_US]
dc.subject: Imbalanced data [en_US]
dc.subject: Machine learning [en_US]
dc.title: Feature Selection in Imbalanced Data [en_US]
dc.type: Article [en_US]
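
Illustrative note: the abstract in this record describes scoring each feature by the F1-score of a one-dimensional decision tree and using that score as a filter. The sketch below shows one way this idea could look, assuming binary labels with 1 as the minority class. It is not the authors' implementation: the record does not state the tree depth, split criterion, or validation scheme, so the sketch substitutes a single-threshold rule (a decision stump) chosen to maximise the F1-score on the training data. The names f1_score, stump_f1, and rank_features, and the demo data, are hypothetical.

```python
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1-score of the positive class (label 1, assumed to be the minority)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def stump_f1(x: Sequence[float], y: Sequence[int]) -> float:
    """Best training F1 of a one-threshold rule on a single feature.

    Assumption: a depth-1 threshold rule stands in for the paper's
    one-dimensional decision tree, whose settings are not given here.
    """
    values = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = 0.0
    for thr in thresholds:
        # Try predicting the minority class on either side of the threshold.
        for high_is_positive in (True, False):
            pred = [1 if (v > thr) == high_is_positive else 0 for v in x]
            best = max(best, f1_score(y, pred))
    return best


def rank_features(
    X: Sequence[Sequence[float]], y: Sequence[int]
) -> List[Tuple[int, float]]:
    """Return (feature index, score) pairs, highest score first."""
    n_features = len(X[0])
    scores = [
        (j, stump_f1([row[j] for row in X], y)) for j in range(n_features)
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical demo data: feature 0 separates the minority class, feature 1 does not.
X_demo = [[0.1, 5], [0.2, 1], [0.3, 4], [0.4, 2],
          [0.5, 3], [0.6, 1], [0.9, 4], [0.95, 2]]
y_demo = [0, 0, 0, 0, 0, 0, 1, 1]
ranking = rank_features(X_demo, y_demo)
print(ranking)
```

Because each feature is scored independently with a one-dimensional search, the cost grows linearly with the number of features, which is consistent with the low computational complexity the abstract mentions. Scoring on held-out data rather than the training data would be a natural refinement.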

Files

Original bundle
Name: Access Instruction - 514.pdf
Size: 102.17 KB
Format: Adobe Portable Document Format