Outlier Detection in High Dimensional Data

dc.contributor.author: Kamalov, Firuz
dc.contributor.author: Leung, Ho Hon
dc.date.accessioned: 2021-02-07T13:14:43Z
dc.date.available: 2021-02-07T13:14:43Z
dc.date.copyright: © 2020
dc.date.issued: 2020-03-01
dc.description.abstract: High-dimensional data poses unique challenges in the outlier detection process. Most of the existing algorithms fail to properly address the issues stemming from a large number of features. In particular, outlier detection algorithms perform poorly on small datasets with a large number of features. In this paper, we propose a novel outlier detection algorithm based on principal component analysis and kernel density estimation. The proposed method is designed to address the challenges of dealing with high-dimensional data by projecting the original data onto a smaller space and using the innate structure of the data to calculate anomaly scores for each data point. Numerical experiments on synthetic and real-life data show that our method performs well on high-dimensional data. In particular, the proposed method outperforms the benchmark methods as measured by F1-score. Our method also produces better-than-average execution times compared with the benchmark methods. © 2020 World Scientific Publishing Co. [en_US]
dc.identifier.citation: Kamalov, F., & Leung, H. H. (2020). Outlier detection in high dimensional data. Journal of Information & Knowledge Management, 19(1), 2040013. https://doi.org/10.1142/S0219649220400134 [en_US]
dc.identifier.issn: 02196492
dc.identifier.uri: https://doi.org/10.1142/S0219649220400134
dc.identifier.uri: http://hdl.handle.net/20.500.12519/328
dc.language.iso: en [en_US]
dc.publisher: World Scientific Publishing Co. Pte Ltd [en_US]
dc.relation: Author affiliations: Kamalov, F., Canadian University Dubai, Dubai, United Arab Emirates; Leung, H.H., UAE University, United Arab Emirates
dc.relation.ispartofseries: Journal of Information & Knowledge Management; Volume 19, Issue 1
dc.rights: Creative Commons Attribution 4.0 International License (CC BY 4.0)
dc.rights.holder: Copyright: © 2020 World Scientific Publishing Co.
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: high dimensional data [en_US]
dc.subject: KDE [en_US]
dc.subject: Outlier detection [en_US]
dc.subject: PCA [en_US]
dc.subject: Anomaly detection [en_US]
dc.subject: Large dataset [en_US]
dc.subject: Numerical methods [en_US]
dc.subject: Principal component analysis [en_US]
dc.subject: Signal detection [en_US]
dc.subject: Statistics [en_US]
dc.subject: Data points [en_US]
dc.subject: Innate structure [en_US]
dc.subject: Kernel Density Estimation [en_US]
dc.subject: Numerical experiments [en_US]
dc.subject: Outlier detection algorithm [en_US]
dc.subject: Outlier detection in high-dimensional datum [en_US]
dc.subject: Real life data [en_US]
dc.subject: Clustering algorithms [en_US]
dc.title: Outlier Detection in High Dimensional Data [en_US]
dc.type: Article [en_US]
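
Illustrative sketch (not part of the record): the abstract describes combining principal component analysis with kernel density estimation to score outliers, but this record does not give the authors' algorithm. The snippet below is only a generic example of that combination, assuming a plain PCA projection followed by a leave-one-out Gaussian KDE. The function name `pca_kde_scores`, the parameters `n_components` and `bandwidth`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def pca_kde_scores(X, n_components=2, bandwidth=1.0):
    """Hypothetical helper: anomaly score = negative log of a Gaussian
    kernel density estimate computed in a PCA-reduced space."""
    X = np.asarray(X, dtype=float)
    # Centre the data and project onto the top principal components (via SVD).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T
    # Pairwise squared Euclidean distances in the reduced space.
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    # Gaussian kernel, leave-one-out: a point does not contribute to its own density.
    K = np.exp(-sq / (2.0 * bandwidth ** 2))
    np.fill_diagonal(K, 0.0)
    density = K.sum(axis=1) / (len(Z) - 1)
    # The kernel's normalising constant is omitted; it does not affect the ranking.
    return -np.log(density + 1e-300)

# Synthetic example (assumed data): 200 points, 50 features, one planted outlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
X[0] += 10.0  # shift the first point in every feature
scores = pca_kde_scores(X, n_components=2, bandwidth=1.0)
```

Higher scores indicate lower estimated density in the reduced space. The number of components and the bandwidth are fixed arbitrarily here, and how the published method selects them is not stated in this record.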

Files

Original bundle
Name: Access Instruction 328.pdf
Size: 56.85 KB
Format: Adobe Portable Document Format