XyGen: Synthetic data generator for feature selection

Date
2023-03
Publisher
Elsevier B.V.
Abstract
Given the large number of feature selection algorithms, a uniform procedure for evaluating their performance has become imperative. We propose a library of synthetic datasets designed specifically to test the effectiveness of feature selection algorithms. The datasets are inspired by applications in electronics and cover a range of characteristics to provide a variety of test scenarios. The software is a Python library with a standard interface for loading and generating datasets. Each dataset is implemented as a function that allows control of various parameters of the data. © 2023 The Author(s)
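As a rough illustration of the idea of a dataset implemented as a parameterized function, the sketch below generates a toy AND-gate dataset with known relevant features. It is not XyGen's actual interface: the function name `make_and_gate_dataset` and its parameters `n_samples`, `n_redundant`, `n_irrelevant`, and `seed` are hypothetical and chosen only for this example.

```python
import numpy as np


def make_and_gate_dataset(n_samples=200, n_redundant=2, n_irrelevant=5, seed=0):
    """Hypothetical generator (not part of XyGen): binary AND-gate data.

    Returns X, y, and the column indices of the truly relevant features,
    so that a feature selection algorithm can be scored against them.
    """
    rng = np.random.default_rng(seed)

    # Two relevant binary inputs; the target is their logical AND.
    relevant = rng.integers(0, 2, size=(n_samples, 2))
    y = relevant[:, 0] & relevant[:, 1]

    # Redundant features: negated copies of randomly chosen relevant inputs.
    source_cols = rng.integers(0, 2, size=n_redundant)
    redundant = 1 - relevant[:, source_cols]

    # Irrelevant features: random bits drawn independently of the target.
    irrelevant = rng.integers(0, 2, size=(n_samples, n_irrelevant))

    X = np.hstack([relevant, redundant, irrelevant])
    relevant_idx = np.array([0, 1])
    return X, y, relevant_idx


X, y, relevant_idx = make_and_gate_dataset(n_samples=100, n_redundant=2, n_irrelevant=3, seed=1)
print(X.shape, y.shape, relevant_idx)
```

Because the relevant columns are known by construction, the features an algorithm selects can be compared directly against `relevant_idx`.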
Keywords
Data mining, Feature selection, Machine learning, Synthetic data
Citation
Kamalov, F., Elnaffar, S., Sulieman, H., & Cherukuri, A. K. (2023). XyGen: Synthetic data generator for feature selection. Software Impacts, 15, 100485. https://doi.org/10.1016/j.simpa.2023.100485
DOI