A buffer-based online clustering for evolving data stream

Data stream clustering plays an important role in data stream mining for knowledge extraction. Numerous researchers have recently studied density-based clustering algorithms due to their capability to generate arbitrarily shaped clusters. However, most of the algorithms are either fully offline, hyb...

Bibliographic Details
Main Authors: Islam, Md. Kamrul; Ahmed, Md. Manjur; Kamal Z., Zamli
Format: Article
Language: English
Published: Elsevier Ltd 2019
Subjects: QA75 Electronic computers. Computer science
Online Access: http://umpir.ump.edu.my/id/eprint/24676/
http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf
id ump-24676
recordtype eprints
spelling ump-24676 2019-04-02T07:34:52Z http://umpir.ump.edu.my/id/eprint/24676/ A buffer-based online clustering for evolving data stream. Islam, Md. Kamrul; Ahmed, Md. Manjur; Kamal Z., Zamli. QA75 Electronic computers. Computer science. Elsevier Ltd 2019. Article. PeerReviewed. pdf. en. http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf Islam, Md. Kamrul and Ahmed, Md. Manjur and Kamal Z., Zamli (2019) A buffer-based online clustering for evolving data stream. Information Sciences, 489. pp. 113-135. ISSN 0020-0255 https://doi.org/10.1016/j.ins.2019.03.022
repository_type Digital Repository
institution_category Local University
institution Universiti Malaysia Pahang
building UMP Institutional Repository
collection Online Access
language English
topic QA75 Electronic computers. Computer science
description Data stream clustering plays an important role in data stream mining for knowledge extraction. Numerous researchers have recently studied density-based clustering algorithms because they can generate arbitrarily shaped clusters. However, most of these algorithms are either fully offline or hybrid online/offline, or they cannot handle the evolving nature of data streams. Recently, a fully online clustering algorithm for evolving data streams called CEDAS was proposed. However, like other density-based clustering algorithms, CEDAS requires the globally optimal radius of the micro-clusters to be predefined, which is a difficult task, and an erroneous choice degrades clustering performance. Moreover, the algorithm ignores temporarily irrelevant micro-clusters, which may become relevant in the future. In this study, we present a fully online density-based clustering algorithm called buffer-based online clustering for evolving data stream (BOCEDS). This algorithm recursively updates the micro-cluster radius to its local optimum. It also introduces a buffer for storing irrelevant micro-clusters and a fully online pruning method for extracting temporarily irrelevant micro-clusters from the buffer. In addition, BOCEDS proposes an online micro-cluster energy-updating function based on the spatial information of the data stream. Experimental results are compared with those of CEDAS and other hybrid online/offline density-based clustering algorithms, and BOCEDS outperforms these algorithms. The sensitivity of the clustering parameters is also measured. The proposed algorithm is then applied to real-world weather data streams to demonstrate its capability to detect changes in the data stream and discover arbitrarily shaped clusters. BOCEDS is available at https://sites.google.com/view/md-manjur-ahmed and https://sites.google.com/view/kamrul-just.
format Article
author Islam, Md. Kamrul
Ahmed, Md. Manjur
Kamal Z., Zamli
author_sort Islam, Md. Kamrul
title A buffer-based online clustering for evolving data stream
title_sort buffer-based online clustering for evolving data stream
publisher Elsevier Ltd
publishDate 2019
url http://umpir.ump.edu.my/id/eprint/24676/
http://umpir.ump.edu.my/id/eprint/24676/1/A%20buffer-based%20online%20clustering%20for%20evolving%20data%20stream.pdf
first_indexed 2023-09-18T22:37:29Z
last_indexed 2023-09-18T22:37:29Z
_version_ 1777416693084061696