CUD Digital repository

Outlier Detection in High Dimensional Data

dc.contributor.author Kamalov, Firuz
dc.contributor.author Leung, Ho Hon
dc.date.accessioned 2021-02-07T13:14:43Z
dc.date.available 2021-02-07T13:14:43Z
dc.date.copyright © 2020
dc.date.issued 2020-03-01
dc.identifier.citation Kamalov, F., & Leung, H. H. (2020). Outlier detection in high dimensional data. Journal of Information & Knowledge Management, 19(1), 2040013. https://doi.org/10.1142/S0219649220400134 en_US
dc.identifier.issn 02196492
dc.identifier.uri https://doi.org/10.1142/S0219649220400134
dc.identifier.uri http://hdl.handle.net/20.500.12519/328
dc.description This article is not available in the CUD collection. The version of record of this article is published in Journal of Information & Knowledge Management (2020), available online at: https://doi.org/10.1142/S0219649220400134 en_US
dc.description.abstract High-dimensional data poses unique challenges in the outlier detection process. Most existing algorithms fail to properly address the issues stemming from a large number of features. In particular, outlier detection algorithms perform poorly on small datasets with a large number of features. In this paper, we propose a novel outlier detection algorithm based on principal component analysis and kernel density estimation. The proposed method is designed to address the challenges of high-dimensional data by projecting the original data onto a smaller space and using the innate structure of the data to calculate an anomaly score for each data point. Numerical experiments on synthetic and real-life data show that our method performs well on high-dimensional data. In particular, the proposed method outperforms the benchmark methods as measured by F1-score. Our method also produces better-than-average execution times compared with the benchmark methods. © 2020 World Scientific Publishing Co. en_US
dc.language.iso en en_US
dc.publisher World Scientific Publishing Co. Pte Ltd en_US
dc.relation Author affiliations: Kamalov, F., Canadian University Dubai, Dubai, United Arab Emirates; Leung, H.H., UAE University, United Arab Emirates
dc.relation.ispartofseries Journal of Information & Knowledge Management; Volume 19, Issue 1
dc.rights Creative Commons Attribution 4.0 International License (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject high dimensional data en_US
dc.subject KDE en_US
dc.subject Outlier detection en_US
dc.subject PCA en_US
dc.subject Anomaly detection en_US
dc.subject Large dataset en_US
dc.subject Numerical methods en_US
dc.subject Principal component analysis en_US
dc.subject Signal detection en_US
dc.subject Statistics en_US
dc.subject Data points en_US
dc.subject Innate structure en_US
dc.subject Kernel Density Estimation en_US
dc.subject Numerical experiments en_US
dc.subject Outlier detection algorithm en_US
dc.subject Outlier detection in high-dimensional datum en_US
dc.subject Real life data en_US
dc.subject Clustering algorithms en_US
dc.title Outlier Detection in High Dimensional Data en_US
dc.type Article en_US
dc.rights.holder Copyright: © 2020 World Scientific Publishing Co.
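
The abstract describes projecting the data onto a smaller space with principal component analysis and then scoring each point with kernel density estimation. The sketch below illustrates that general idea only; it is not the authors' algorithm, whose details are not given in this record. The function names, the choice of two components, the Gaussian kernel, the fixed bandwidth, and the leave-one-out scoring are all assumptions made for illustration.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def kde_log_density(Z, bandwidth):
    """Leave-one-out Gaussian KDE log-density for each row of Z."""
    n, d = Z.shape
    # Pairwise squared Euclidean distances between projected points.
    sq_dist = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-sq_dist / (2.0 * bandwidth**2))
    # Exclude each point's own kernel so isolated points get low density.
    np.fill_diagonal(K, 0.0)
    norm = (n - 1) * (2.0 * np.pi * bandwidth**2) ** (d / 2.0)
    # The tiny constant avoids log(0) when every kernel value underflows.
    return np.log(K.sum(axis=1) / norm + 1e-300)


def anomaly_scores(X, n_components=2, bandwidth=1.0):
    """Higher score means lower estimated density in the projected space.

    n_components and bandwidth are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    Z = pca_project(X, n_components)
    return -kde_log_density(Z, bandwidth)


# Synthetic example: 60 inliers in 50 dimensions plus one distant point.
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(60, 50))
outlier = np.full((1, 50), 6.0)
X = np.vstack([inliers, outlier])

scores = anomaly_scores(X)
top = int(np.argmax(scores))  # index of the most anomalous point
```

In this toy setting there are fewer samples than features, which is the regime the abstract singles out as difficult. In practice the number of components and the bandwidth would need to be tuned, and a threshold on the scores would be required to turn them into outlier labels.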



Except where otherwise noted, this item's license is described as Creative Commons Attribution 4.0 International License (CC BY 4.0).
