Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters
K-means is an unsupervised, partition-based clustering algorithm. It is popular and widely used for its simplicity and speed. K-means clustering produces a number of separate, flat (non-hierarchical) clusters and is suitable for generating globular clusters. The main drawback of the k-means...
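To illustrate the drawback the abstract mentions, here is a minimal sketch of conventional (Lloyd-style) k-means, not the improved algorithm from the paper. The function name `kmeans` and the parameters `max_iter` and `seed` are illustrative assumptions. The point to notice is that `k` is a required argument and that the initial centroids are simply drawn at random from the data.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Conventional k-means on a list of equal-length numeric tuples.

    k is a required argument: the caller must choose the number of
    clusters before the algorithm can run.
    """
    rng = random.Random(seed)
    # Conventional initialisation: pick k distinct data points at random.
    centroids = rng.sample(points, k)
    labels = None
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance; ties go to the lower index).
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels, iteration


# Usage: two well-separated groups of points, with k chosen by hand.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (100.0, 100.0), (100.0, 101.0), (101.0, 100.0)]
centroids, labels, n_iter = kmeans(data, k=2)
```

Both the result and the number of passes depend on the chosen `k` and on the random starting centroids, which are the two inputs the paper proposes to generate automatically.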
Main Authors: | Wan Maseri, Wan Mohd; Beg, Abul Hashem; Herawan, Tutut; Noraziah, Ahmad |
---|---|
Format: | Article |
Language: | English |
Published: | IGI Global, 2011 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://umpir.ump.edu.my/id/eprint/9328/ http://umpir.ump.edu.my/id/eprint/9328/7/improved-parameterless-k-means_-auto-generation-centroids-and-distance-data-point-clusters%281%29.pdf |
id |
ump-9328 |
---|---|
recordtype |
eprints |
spelling |
ump-9328 2018-02-05T00:25:43Z http://umpir.ump.edu.my/id/eprint/9328/ Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters Wan Maseri, Wan Mohd; Beg, Abul Hashem; Herawan, Tutut; Noraziah, Ahmad QA75 Electronic computers. Computer science K-means is an unsupervised, partition-based clustering algorithm. It is popular and widely used for its simplicity and speed. K-means clustering produces a number of separate, flat (non-hierarchical) clusters and is suitable for generating globular clusters. The main drawback of the k-means algorithm is that the user must specify the number of clusters in advance. This paper presents an improved version of the K-means algorithm that automatically generates the initial number of clusters (k) and uses a new approach to defining the initial centroids for an effective and efficient clustering process. The underlying mechanism has been analyzed and tested experimentally. The experimental results show that the number of iterations is reduced to 50% and that the run time is lower and constant, depending on the maximum distance between data points regardless of the number of data points. IGI Global 2011-07 Article PeerReviewed application/pdf en http://umpir.ump.edu.my/id/eprint/9328/7/improved-parameterless-k-means_-auto-generation-centroids-and-distance-data-point-clusters%281%29.pdf Wan Maseri, Wan Mohd and Beg, Abul Hashem and Herawan, Tutut and Noraziah, Ahmad (2011) Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters. International Journal of Information Retrieval Research (IJIRR), 1 (3). pp. 1-14. ISSN 2155-6377 (print); 2155-6385 (online) http://www.igi-global.com/article/improved-parameterless-means/64168 10.4018/ijirr.2011070101 |
repository_type |
Digital Repository |
institution_category |
Local University |
institution |
Universiti Malaysia Pahang |
building |
UMP Institutional Repository |
collection |
Online Access |
language |
English |
topic |
QA75 Electronic computers. Computer science |
spellingShingle |
QA75 Electronic computers. Computer science Wan Maseri, Wan Mohd Beg, Abul Hashem Herawan, Tutut Noraziah, Ahmad Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters |
description |
K-means is an unsupervised, partition-based clustering algorithm. It is popular and widely used for
its simplicity and speed. K-means clustering produces a number of separate, flat (non-hierarchical) clusters
and is suitable for generating globular clusters. The main drawback of the k-means algorithm is that the user
must specify the number of clusters in advance. This paper presents an improved version of the K-means algorithm
that automatically generates the initial number of clusters (k) and uses a new approach to defining the initial
centroids for an effective and efficient clustering process. The underlying mechanism has been analyzed and
tested experimentally. The experimental results show that the number of iterations is reduced to 50% and that the
run time is lower and constant, depending on the maximum distance between data points regardless of the number of data points. |
format |
Article |
author |
Wan Maseri, Wan Mohd; Beg, Abul Hashem; Herawan, Tutut; Noraziah, Ahmad |
author_facet |
Wan Maseri, Wan Mohd; Beg, Abul Hashem; Herawan, Tutut; Noraziah, Ahmad |
author_sort |
Wan Maseri, Wan Mohd |
title |
Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters |
title_short |
Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters |
title_full |
Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters |
title_fullStr |
Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters |
title_full_unstemmed |
Improved Parameterless K-Means: Auto-Generation Centroids and Distance Data Point Clusters |
title_sort |
improved parameterless k-means: auto-generation centroids and distance data point clusters |
publisher |
IGI Global |
publishDate |
2011 |
url |
http://umpir.ump.edu.my/id/eprint/9328/ http://umpir.ump.edu.my/id/eprint/9328/7/improved-parameterless-k-means_-auto-generation-centroids-and-distance-data-point-clusters%281%29.pdf |
first_indexed |
2023-09-18T22:07:47Z |
last_indexed |
2023-09-18T22:07:47Z |
_version_ |
1777414824720859136 |