RARE: mining colossal closed itemset in high dimensional data

Bibliographic Details
Main Authors: Md Zaki, Fatimah Audah, Zulkurnain, Nurul Fariza
Format: Article
Language: English
Published: Elsevier B.V. 2018
Online Access: http://irep.iium.edu.my/65106/
http://irep.iium.edu.my/65106/24/65106_RARE-%20mining%20colossal%20closed.pdf
http://irep.iium.edu.my/65106/13/65106_Mining%20colossal%20closed%20itemset%20in%20high%20dimensional%20data_SCOPUS.pdf
http://irep.iium.edu.my/65106/18/65106%20RARE_Mining%20colossal%20closed%20itemset%20WOS.pdf
Description
Summary: Present-day society has become a continuous data generator. Massive automatic data collection has produced a new kind of dataset, termed ‘high-dimensional data’, which is characterized by a relatively small number of rows compared with a large number of columns (or dimensions). Among the many data mining tasks, association rules have been used extensively to describe the correlations between the variables in a dataset. Mining association rules relies heavily on how efficiently an algorithm can extract all frequent itemsets in the database. Improvements in the run time and memory consumption of these algorithms depend strongly on the search strategy, effective pruning strategies, and the method of closure checking. Without these techniques, the choice between depth-first and breadth-first search makes little difference, mainly because the search space is similar. This paper therefore investigates the strategies implemented in both row- and column-enumeration-based algorithms and proposes RARE, a breadth-first, bottom-up row-enumeration algorithm for mining colossal closed itemsets in high-dimensional data.
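
The general idea of breadth-first, bottom-up row enumeration can be illustrated with a minimal sketch. This is not the RARE algorithm from the paper, whose pruning and closure-checking details are not given in this record; it only relies on the standard fact that the intersection of any set of rows is a closed itemset. The function name `closed_itemsets_by_rows`, the parameters `min_support` and `min_length`, and the use of a length threshold as a stand-in for "colossal" are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def closed_itemsets_by_rows(
    rows: Iterable[Iterable[str]],
    min_support: int = 1,
    min_length: int = 1,
) -> Dict[FrozenSet[str], int]:
    """Hypothetical sketch: enumerate closed itemsets by intersecting rows.

    Level 1 holds the itemsets of single rows; each later level intersects
    the previous level's itemsets with every row (breadth-first, bottom-up).
    """
    row_sets: List[FrozenSet[str]] = [frozenset(r) for r in rows]
    closed: Dict[FrozenSet[str], int] = {}
    seen: Set[FrozenSet[str]] = set()
    frontier: Set[FrozenSet[str]] = set(row_sets)

    while frontier:
        next_frontier: Set[FrozenSet[str]] = set()
        for itemset in frontier:
            if itemset in seen:
                continue
            seen.add(itemset)
            # Support = number of rows that contain the whole itemset.
            closed[itemset] = sum(1 for r in row_sets if itemset <= r)
            for r in row_sets:
                candidate = itemset & r
                # Intersections only shrink, so an itemset that is already
                # too short can never grow back: prune it here.
                if len(candidate) >= min_length and candidate not in seen:
                    next_frontier.add(candidate)
        frontier = next_frontier

    # Apply the thresholds once at the end (assumed definitions).
    return {
        items: support
        for items, support in closed.items()
        if support >= min_support and len(items) >= min_length
    }


if __name__ == "__main__":
    data = [{"a", "b", "c", "d"}, {"a", "b", "c"}, {"a", "b", "e"}, {"c", "d"}]
    for items, support in closed_itemsets_by_rows(data, 2, 2).items():
        print(sorted(items), support)
```

Working on rows rather than columns suits the setting described in the summary: with few rows and many columns, the number of row combinations is far smaller than the number of column combinations. A practical algorithm would add stronger pruning and a cheaper closure check than the full scan used here.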