Finite Sample Based Mutual Information

Date

2021

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Mutual information is a popular metric in machine learning. In the case of a discrete target variable and a continuous feature variable, the mutual information can be calculated as a sum-integral of the weighted log-likelihood ratio of the joint and marginal density distributions. However, in practice the true density distributions are unavailable and only a finite sample of the population is given. In this paper, we propose a novel method for calculating the mutual information for continuous variables using a finite sample of the population. The proposed method is based on approximating the underlying continuous density distribution using kernel density estimation. Unlike previous kernel-based approaches for estimating mutual information, our method directly calculates the integral involved in the formula. Numerical experiments demonstrate that the proposed method produces more accurate results than currently used feature selection approaches. In addition, our method demonstrates substantially faster computation times than the benchmark methods. © 2013 IEEE.

Description

This article is not available in the CUD collection. The version of record of this article is published in IEEE Access (2021), available online at: https://doi.org/10.1109/ACCESS.2021.3107031

Keywords

Continuous variable, feature evaluation, feature selection, finite sample, kernel density estimation, mutual information

Citation

Rajab, K., & Kamalov, F. (2021). Finite sample based mutual information. IEEE Access, 9, 118871-118879. https://doi.org/10.1109/ACCESS.2021.3107031
