A Feature Selection Method Based on Ranked Vector Scores of Features for Classification

Kamalov, Firuz; Thabtah, Fadi

dc.contributor.author	Kamalov, Firuz
dc.contributor.author	Thabtah, Fadi
dc.date.accessioned	2020-12-12T06:45:20Z
dc.date.available	2020-12-12T06:45:20Z
dc.date.copyright	© 2017
dc.date.issued	2017-12-01
dc.description	This article is not available in the CUD collection. The version of record of this article is published in Annals of Data Science (2017), available online at: https://doi.org/10.1007/s40745-017-0116-1	en_US
dc.description.abstract	One of the major aspects of any classification process is selecting the relevant set of features to be used in a classification algorithm. This initial step in data analysis is called the feature selection process. Disposing of the irrelevant features from the dataset will reduce the complexity of the classification task and will increase the robustness of the decision rules when applied on the test set. This paper proposes a new filtering method that combines and normalizes the scores of three major feature selection methods: information gain, chi-squared statistic and inter-correlation. Our method utilizes the strengths of each of the aforementioned methods to maximum advantage while avoiding their drawbacks—especially the disparity of the results produced by these methods. Our filtering method stabilizes each variable score and gives it the true rank among the input data’s available variables. Hence it maximizes the stability in the variables’ scores without losing the overall accuracy of the predictive model. A number of experiments on different datasets from various domains have shown that features chosen by the proposed method are highly predictive when compared with features selected by other existing filtering methods. The evaluation of the filtering phase was conducted via thorough experimentations using a number of predictive classification algorithms in addition to statistical analysis of the filtering methods’ scores. © 2017, Springer-Verlag GmbH Germany.	en_US
dc.identifier.citation	Kamalov, F. & Thabtah, F. (2017). A Feature Selection Method Based on Ranked Vector Scores of Features for Classification. Annals of Data Science 4(4), 483–502. https://doi.org/10.1007/s40745-017-0116-1	en_US
dc.identifier.issn	21985804
dc.identifier.uri	https://doi.org/10.1007/s40745-017-0116-1
dc.identifier.uri	http://hdl.handle.net/20.500.12519/300
dc.language.iso	en	en_US
dc.publisher	Springer Science and Business Media Deutschland GmbH	en_US
dc.relation	Authors' Affiliations: Kamalov, F., Canadian University of Dubai, Dubai, United Arab Emirates; Thabtah, F., University of Huddersfield, Huddersfield, United Kingdom
dc.relation.ispartofseries	Annals of Data Science;Volume 4, Issue 4
dc.rights	Permission to reuse the abstract has been secured from Springer Science and Business Media Deutschland GmbH
dc.rights.holder	Copyright: © 2017, Springer-Verlag GmbH Germany
dc.subject	Classification accuracy	en_US
dc.subject	Data mining	en_US
dc.subject	Dimensionality reduction	en_US
dc.subject	Feature selection	en_US
dc.subject	Predictive models	en_US
dc.subject	Ranking of features	en_US
dc.title	A Feature Selection Method Based on Ranked Vector Scores of Features for Classification	en_US
dc.type	Article	en_US
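
The abstract describes combining and normalizing the scores of three filter methods (information gain, chi-squared statistic and inter-correlation) into a single ranking. The record does not give the exact normalization or combination rule, so the following is only a minimal sketch under stated assumptions: each score list is min-max scaled to [0, 1], the correlation is taken in absolute value, and the three scaled scores of a feature are treated as a vector whose Euclidean length is used for ranking. The function names, feature names and numeric scores are hypothetical and are not taken from the article.

```python
import math


def min_max(scores):
    """Rescale a list of scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combined_ranking(names, info_gain, chi_squared, correlation):
    """Rank features by the length of their normalized score vector.

    Assumption: the combination rule (Euclidean norm of min-max scaled
    scores) is illustrative only; the article may define it differently.
    """
    normalized = [
        min_max(info_gain),
        min_max(chi_squared),
        min_max([abs(c) for c in correlation]),  # sign of correlation ignored
    ]
    magnitude = {
        name: math.sqrt(sum(column[i] ** 2 for column in normalized))
        for i, name in enumerate(names)
    }
    ranking = sorted(names, key=lambda n: magnitude[n], reverse=True)
    return ranking, magnitude


# Hypothetical per-feature scores, as if produced by three separate filters.
names = ["age", "income", "zip_code"]
ranking, magnitude = combined_ranking(
    names,
    info_gain=[0.30, 0.10, 0.02],
    chi_squared=[12.0, 40.0, 1.5],
    correlation=[0.45, -0.60, 0.05],
)
print(ranking)
```

Scaling each filter's scores before combining them keeps a filter with a large numeric range, such as the chi-squared statistic, from dominating the combined rank.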

Files

Original bundle

Name: Access Instruction 300.pdf
Size: 102.84 KB
Format: Adobe Portable Document Format

Collections

Department of Electrical Engineering