Finite Sample Based Mutual Information
Institute of Electrical and Electronics Engineers Inc.
Mutual information is a popular metric in machine learning. In the case of a discrete target variable and a continuous feature variable, the mutual information can be calculated as a sum-integral of the weighted log-likelihood ratio of the joint and marginal densities. In practice, however, the true densities are unavailable and only a finite sample of the population is given. In this paper, we propose a novel method for calculating the mutual information for continuous variables using a finite sample of the population. The proposed method approximates the underlying continuous density using Kernel Density Estimation. Unlike previous kernel-based approaches for estimating mutual information, our method directly calculates the integral involved in the formula. Numerical experiments demonstrate that the proposed method produces more accurate results than the currently used feature selection approaches. In addition, our method demonstrates substantially faster computation times than the benchmark methods.
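To make the quantity described in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard form I(X; Y) = sum over y of P(y) * integral of p(x|y) log(p(x|y) / p(x)) dx, estimates each class-conditional density p(x|y) with SciPy's `gaussian_kde` (default bandwidth), and evaluates the integral with general-purpose quadrature (`scipy.integrate.quad`). The function name `kde_mutual_information`, the integration limits, and the choice of quadrature are assumptions made for illustration; the abstract does not specify how the paper computes the integral.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import gaussian_kde


def kde_mutual_information(x, y):
    """Estimate I(X; Y) in nats for a continuous feature x and discrete target y.

    Hypothetical helper for illustration only; not the paper's algorithm.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / counts.sum()  # empirical P(y)

    # One KDE per class approximates the class-conditional density p(x | y).
    kdes = [gaussian_kde(x[y == c]) for c in classes]

    # Assumed integration limits: the sample range padded by 3 standard deviations.
    pad = 3.0 * x.std()
    lo, hi = x.min() - pad, x.max() + pad

    def marginal(t):
        # p(x) = sum_y P(y) p(x | y)
        return sum(p * k(t)[0] for p, k in zip(priors, kdes))

    mi = 0.0
    for p, k in zip(priors, kdes):
        def integrand(t, p=p, k=k):
            cond = k(t)[0]
            if cond <= 0.0:  # density underflowed; contribution is zero
                return 0.0
            return p * cond * np.log(cond / marginal(t))

        value, _ = quad(integrand, lo, hi, limit=200)
        mi += value
    return mi
```

Calling `kde_mutual_information(x, y)` on a feature column `x` and label vector `y` returns a single score; under this sketch, scores for different features could be compared to rank them for feature selection. For a binary target the estimate is bounded above by log 2 nats, the entropy of a balanced label.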
This article is not available in the CUD collection. The version of record of this article is published in IEEE Access (2021) and is available online at: https://doi.org/10.1109/ACCESS.2021.3107031
Continuous variable, feature evaluation, feature selection, finite sample, kernel density estimation, mutual information
Rajab, K., & Kamalov, F. (2021). Finite sample based mutual information. IEEE Access, 9, 118871-118879. https://doi.org/10.1109/ACCESS.2021.3107031