Outlier Detection in High Dimensional Data

dc.contributor.author Kamalov, Firuz
dc.contributor.author Leung, Ho Hon
dc.date.accessioned 2021-02-07T13:14:43Z
dc.date.available 2021-02-07T13:14:43Z
dc.date.copyright © 2020
dc.date.issued 2020-03-01
dc.description This article is not available in the CUD collection. The version of record of this article is published in the Journal of Information & Knowledge Management (2020), available online at: https://doi.org/10.1142/S0219649220400134 en_US
dc.description.abstract High-dimensional data poses unique challenges for the outlier detection process. Most existing algorithms fail to properly address the issues stemming from a large number of features. In particular, outlier detection algorithms perform poorly on small datasets with a large number of features. In this paper, we propose a novel outlier detection algorithm based on principal component analysis and kernel density estimation. The proposed method is designed to address the challenges of high-dimensional data by projecting the original data onto a lower-dimensional space and using the innate structure of the data to calculate an anomaly score for each data point. Numerical experiments on synthetic and real-life data show that our method performs well on high-dimensional data. In particular, the proposed method outperforms the benchmark methods as measured by F1-score. Our method also produces better-than-average execution times compared with the benchmark methods. © 2020 World Scientific Publishing Co. en_US
dc.identifier.citation Kamalov, F., & Leung, H. H. (2020). Outlier detection in high dimensional data. Journal of Information & Knowledge Management, 19(1), 2040013. https://doi.org/10.1142/S0219649220400134 en_US
dc.identifier.issn 02196492
dc.identifier.uri https://doi.org/10.1142/S0219649220400134
dc.identifier.uri http://hdl.handle.net/20.500.12519/328
dc.language.iso en en_US
dc.publisher World Scientific Publishing Co. Pte Ltd en_US
dc.relation Authors Affiliations : Kamalov, F., Canadian University Dubai, Dubai, United Arab Emirates; Leung, H.H., UAE University, United Arab Emirates
dc.relation.ispartofseries Journal of Information & Knowledge Management; Volume 19, Issue 1
dc.rights Creative Commons Attribution 4.0 International License (CC BY 4.0)
dc.rights.holder Copyright : © 2020 World Scientific Publishing Co.
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject high dimensional data en_US
dc.subject KDE en_US
dc.subject Outlier detection en_US
dc.subject PCA en_US
dc.subject Anomaly detection en_US
dc.subject Large dataset en_US
dc.subject Numerical methods en_US
dc.subject Principal component analysis en_US
dc.subject Signal detection en_US
dc.subject Statistics en_US
dc.subject Data points en_US
dc.subject Innate structure en_US
dc.subject Kernel Density Estimation en_US
dc.subject Numerical experiments en_US
dc.subject Outlier detection algorithm en_US
dc.subject Outlier detection in high-dimensional datum en_US
dc.subject Real life data en_US
dc.subject Clustering algorithms en_US
dc.title Outlier Detection in High Dimensional Data en_US
dc.type Article en_US
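
The abstract describes projecting the data onto a smaller space with principal component analysis and then scoring each point with kernel density estimation. The record does not give the authors' exact procedure, so the following is only a minimal sketch of that general idea under stated assumptions: the function name pca_kde_scores, the choice of two components, the Gaussian kernel, the Scott's-rule bandwidth, the leave-one-out density, and the negative-log-density score are all illustrative choices, not details taken from the paper.

```python
import numpy as np


def pca_kde_scores(X, n_components=2, bandwidth=None):
    """Hypothetical helper: anomaly scores from PCA followed by a Gaussian KDE.

    Higher score means lower estimated density in the projected space.
    This is an illustrative sketch, not the published algorithm.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature

    # PCA via SVD: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T

    # Assumption: rescale each component to unit variance so that a single
    # bandwidth is reasonable for all projected coordinates.
    Z = Z / Z.std(axis=0, ddof=1)

    n, d = Z.shape
    if bandwidth is None:
        bandwidth = n ** (-1.0 / (d + 4))  # Scott's rule for unit-variance data

    # Pairwise squared distances and Gaussian kernel values.
    sq_dist = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    np.fill_diagonal(K, 0.0)  # leave-one-out: a point does not vouch for itself

    norm = (n - 1) * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    density = K.sum(axis=1) / norm

    # Negative log-density as the anomaly score (tiny constant avoids log(0)).
    return -np.log(density + 1e-300)


# Synthetic example: 60 inliers in 50 dimensions plus one shifted point.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(60, 50))
outlier = rng.normal(size=(1, 50)) + 8.0
X = np.vstack([inliers, outlier])

scores = pca_kde_scores(X)
most_anomalous = int(np.argmax(scores))
```

In this synthetic setup the shifted point sits at index 60, so most_anomalous can be compared against that index. The pairwise-distance step needs memory quadratic in the number of points, which is acceptable for the small-sample, many-feature regime the abstract targets but would need a different density estimator for large datasets.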