Outlier Detection in High Dimensional Data
dc.contributor.author | Kamalov, Firuz | |
dc.contributor.author | Leung, Ho Hon | |
dc.date.accessioned | 2021-02-07T13:14:43Z | |
dc.date.available | 2021-02-07T13:14:43Z | |
dc.date.copyright | © 2020 | |
dc.date.issued | 2020-03-01 | |
dc.description.abstract | High-dimensional data poses unique challenges in the outlier detection process. Most of the existing algorithms fail to properly address the issues stemming from a large number of features. In particular, outlier detection algorithms perform poorly on small datasets with a large number of features. In this paper, we propose a novel outlier detection algorithm based on principal component analysis and kernel density estimation. The proposed method is designed to address the challenges of dealing with high-dimensional data by projecting the original data onto a smaller space and using the innate structure of the data to calculate anomaly scores for each data point. Numerical experiments on synthetic and real-life data show that our method performs well on high-dimensional data. In particular, the proposed method outperforms the benchmark methods as measured by F1-score. Our method also produces better-than-average execution times compared with the benchmark methods. © 2020 World Scientific Publishing Co. | en_US |
dc.identifier.citation | Kamalov, F., & Leung, H. H. (2020). Outlier detection in high dimensional data. Journal of Information & Knowledge Management, 19(1), 2040013. https://doi.org/10.1142/S0219649220400134 | en_US |
dc.identifier.issn | 02196492 | |
dc.identifier.uri | https://doi.org/10.1142/S0219649220400134 | |
dc.identifier.uri | http://hdl.handle.net/20.500.12519/328 | |
dc.language.iso | en | en_US |
dc.publisher | World Scientific Publishing Co. Pte Ltd | en_US |
dc.relation | Authors Affiliations : Kamalov, F., Canadian University Dubai, Dubai, United Arab Emirates; Leung, H.H., UAE University, United Arab Emirates | |
dc.relation.ispartofseries | Journal of Information & Knowledge Management; Volume 19, Issue 1 | |
dc.rights | Creative Commons Attribution 4.0 International License (CC BY 4.0) | |
dc.rights.holder | Copyright : © 2020 World Scientific Publishing Co. | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.subject | high dimensional data | en_US |
dc.subject | KDE | en_US |
dc.subject | Outlier detection | en_US |
dc.subject | PCA | en_US |
dc.subject | Anomaly detection | en_US |
dc.subject | Large dataset | en_US |
dc.subject | Numerical methods | en_US |
dc.subject | Principal component analysis | en_US |
dc.subject | Signal detection | en_US |
dc.subject | Statistics | en_US |
dc.subject | Data points | en_US |
dc.subject | Innate structure | en_US |
dc.subject | Kernel Density Estimation | en_US |
dc.subject | Numerical experiments | en_US |
dc.subject | Outlier detection algorithm | en_US |
dc.subject | Outlier detection in high-dimensional datum | en_US |
dc.subject | Real life data | en_US |
dc.subject | Clustering algorithms | en_US |
dc.title | Outlier Detection in High Dimensional Data | en_US |
dc.type | Article | en_US |
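As an illustration of the approach summarized in the abstract (project the data onto a smaller space with principal component analysis, then score each point by kernel density estimation), the following is a minimal Python sketch. It is not the authors' algorithm: the function name `pca_kde_scores`, the choice of two components, the fixed Gaussian bandwidth, and the negative log-density score are all assumptions made for illustration, and the synthetic data is invented for the example.

```python
import numpy as np

def pca_kde_scores(X, n_components=2, bandwidth=1.0):
    """Return one score per row of X; a higher score means more anomalous.

    Hypothetical helper for illustration only, not the published method.
    """
    X = np.asarray(X, dtype=float)
    # Center the data and project onto the top principal directions via SVD.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T
    # Gaussian kernel density estimate evaluated at every projected point.
    sq_dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    norm = (2.0 * np.pi * bandwidth ** 2) ** (n_components / 2.0)
    density = np.exp(-sq_dists / (2.0 * bandwidth ** 2)).mean(axis=1) / norm
    # A low density means the point is isolated, so negate the log-density.
    return -np.log(density)

# Synthetic example (assumed data): few samples, many features.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(60, 50))   # 60 points drawn around the origin
outlier = np.full((1, 50), 8.0)       # one point far from the cloud
X = np.vstack([inliers, outlier])

scores = pca_kde_scores(X)
most_anomalous = int(np.argmax(scores))
```

In this synthetic setup the isolated point is the last row of `X`, and it is the one expected to receive the highest score. The number of retained components and the kernel bandwidth are free parameters in this sketch and would need tuning on real data.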
Files
Original bundle
- Name: Access Instruction 328.pdf
- Size: 56.85 KB
- Format: Adobe Portable Document Format