Performances of k-means clustering algorithm with different distance metrics

Ghazal, Taher M.; Hussain, Muhammad Zahid; Said, Raed A.; Nadeem, Afrozah; Hasan, Mohammad Kamrul; Ahmad, Munir; Khan, Muhammad Adnan; Naseem, Muhammad Tahir

dc.contributor.author	Ghazal, Taher M.
dc.contributor.author	Hussain, Muhammad Zahid
dc.contributor.author	Said, Raed A.
dc.contributor.author	Nadeem, Afrozah
dc.contributor.author	Hasan, Mohammad Kamrul
dc.contributor.author	Ahmad, Munir
dc.contributor.author	Khan, Muhammad Adnan
dc.contributor.author	Naseem, Muhammad Tahir
dc.date.accessioned	2021-09-12T12:10:43Z
dc.date.available	2021-09-12T12:10:43Z
dc.date.issued	2021
dc.description.abstract	Clustering is the process of grouping data by similar properties: a dataset is partitioned into groups (clusters) whose elements resemble one another, such that the distance between elements in the same cluster is sufficiently smaller than the distance between elements of different clusters. Similarity can therefore be expressed as a distance measure. One of the most popular clustering algorithms is K-means, which measures the distance between every point of the dataset and the cluster centroids and assigns each point to the nearest cluster. A range of distance metrics can be used to calculate these point-to-point distances. In this research, the K-means clustering algorithm is evaluated with three different distance metrics in terms of execution time, across different datasets and different numbers of clusters. The results indicate that the Manhattan distance metric achieves the best results in most cases. These results also demonstrate that the choice of distance metric can affect the execution time and the number of clusters created by the K-means algorithm. © 2021, Tech Science Press. All rights reserved.	en_US
dc.identifier.citation	Ghazal, T. M., Hussain, M. Z., Said, R. A., Nadeem, A., Hasan, M. K., Ahmad, M., . . . Naseem, M. T. (2021). Performances of k-means clustering algorithm with different distance metrics. Intelligent Automation and Soft Computing, 30(2), 735-742. https://doi.org/10.32604/iasc.2021.019067	en_US
dc.identifier.issn	10798587
dc.identifier.uri	https://doi.org/10.32604/iasc.2021.019067
dc.identifier.uri	http://hdl.handle.net/20.500.12519/440
dc.language.iso	en	en_US
dc.publisher	Tech Science Press	en_US
dc.relation	Author affiliations: Ghazal, T.M., Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi, Selangor, 43600, Malaysia, School of Information Technology, Skyline University College, University City Sharjah, Sharjah, 1797, United Arab Emirates; Hussain, M.Z., Riphah School of Computing & Innovation, Faculty of Computing, Riphah International University, Lahore Campus, Lahore, 54000, Pakistan; Said, R.A., Canadian University Dubai, Dubai, United Arab Emirates; Nadeem, A., Department of Computer Science, Lahore Garrison University, Lahore, 54000, Pakistan; Hasan, M.K., Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi, Selangor, 43600, Malaysia; Ahmad, M., School of Computer Science, National College of Business Administration & Economics, Lahore, 54000, Pakistan; Khan, M.A., Riphah School of Computing & Innovation, Faculty of Computing, Riphah International University, Lahore Campus, Lahore, 54000, Pakistan, Pattern Recognition and Machine Learning Lab, Department of Software Engineering, Gachon University, Seongnam, 13557, South Korea; Naseem, M.T., Riphah School of Computing & Innovation, Faculty of Computing, Riphah International University, Lahore Campus, Lahore, 54000, Pakistan
dc.relation.ispartofseries	Intelligent Automation and Soft Computing ; Volume 30, Issue 2
dc.rights	Creative Commons Attribution 4.0 International License
dc.rights.holder	Copyright : © 2021, Tech Science Press. All rights reserved.
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Distance metrics	en_US
dc.subject	Euclidean distance	en_US
dc.subject	K-means clustering	en_US
dc.subject	Manhattan distance
dc.subject	Minkowski distance
dc.title	Performances of k-means clustering algorithm with different distance metrics	en_US
dc.type	Article	en_US
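
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `minkowski` and `kmeans`, the synthetic data, and the choice of p = 3 for the Minkowski case are all assumptions made for illustration. The sketch also keeps the usual mean-based centroid update for every metric, which is an assumption; the mean is only the exact minimizer for squared Euclidean distance.

```python
import random
import time


def minkowski(a, b, p):
    """Minkowski distance of order p; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def kmeans(points, k, p=2, max_iter=100, seed=0):
    """Lloyd-style K-means with a Minkowski-p assignment step (illustrative)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # hypothetical init: k random data points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: minkowski(pt, centroids[j], p))
            for pt in points
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


if __name__ == "__main__":
    # Synthetic 2-D data around three assumed centres, for timing only.
    rng = random.Random(42)
    data = [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (5, 5), (0, 5)]
        for _ in range(100)
    ]
    for name, p in [("Manhattan", 1), ("Euclidean", 2), ("Minkowski (p=3)", 3)]:
        start = time.perf_counter()
        labels, _ = kmeans(data, k=3, p=p)
        elapsed = time.perf_counter() - start
        print(f"{name}: {elapsed:.4f} s, {len(set(labels))} non-empty clusters")
```

Timings from a sketch like this depend on the data, the initialization, and the hardware, so they should not be read as a reproduction of the paper's results.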

Files

Original bundle

Name: Access Instruction 440.pdf
Size: 56.38 KB
Format: Adobe Portable Document Format

Name: 440.pdf
Size: 631.95 KB
Format: Adobe Portable Document Format

Collections

International Business