RARE: mining colossal closed itemset in high dimensional data
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier B.V. 2018 |
Subjects: | |
Online Access: | http://irep.iium.edu.my/65106/ http://irep.iium.edu.my/65106/24/65106_RARE-%20mining%20colossal%20closed.pdf http://irep.iium.edu.my/65106/13/65106_Mining%20colossal%20closed%20itemset%20in%20high%20dimensional%20data_SCOPUS.pdf http://irep.iium.edu.my/65106/18/65106%20RARE_Mining%20colossal%20closed%20itemset%20WOS.pdf |
Summary: | Present-day society has become a continuous data generator. Massive automatic data collection has produced a new genre of dataset, termed 'high-dimensional data', which is characterized by a relatively small number of rows compared with a large number of columns (or dimensions). Among the many data mining tasks, association rules have been extensively employed to describe the correlations between the variables found in a dataset. Mining association rules relies heavily on the efficiency of the algorithms that extract all frequent itemsets in the database. The run time and memory consumption of these algorithms are strongly influenced by search strategies, effective pruning strategies, and the method of closure checking. Without these techniques, the choice between depth-first and breadth-first search makes little difference, mainly because the search space is similar. This paper therefore investigates the strategies implemented in both row-enumeration-based and column-enumeration-based algorithms and proposes RARE, a breadth-first bottom-up row-enumeration algorithm for mining colossal closed itemsets in high-dimensional data. |
---|
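
The summary names row enumeration and closed itemsets without illustrating them, so the following is a minimal sketch of the general idea only. It is not the RARE algorithm, whose pruning and closure-checking details this record does not give. It assumes a plain breadth-first closure under row intersection: level 1 holds the itemset of each single row, and each later level intersects the previous level's itemsets with every row. The function name `mine_closed_itemsets` and the `min_support` parameter are hypothetical, chosen for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set


def mine_closed_itemsets(
    rows: Iterable[Set[str]], min_support: int = 1
) -> Dict[FrozenSet[str], int]:
    """Illustrative row enumeration (an assumption, not the RARE algorithm).

    Every closed itemset is the intersection of the rows that contain it,
    so enumerating intersections of row combinations, level by level,
    reaches all closed itemsets.
    """
    data = [frozenset(r) for r in rows if r]

    # Level 1: itemsets of single rows.
    frontier = set(data)
    seen: Set[FrozenSet[str]] = set()

    # Breadth-first: each pass intersects the current level with every row,
    # which corresponds to growing the underlying row combination by one row.
    while frontier:
        next_frontier: Set[FrozenSet[str]] = set()
        for itemset in frontier:
            if itemset in seen:
                continue
            seen.add(itemset)
            for row in data:
                common = itemset & row
                if common and common not in seen:
                    next_frontier.add(common)
        frontier = next_frontier

    # Support = number of rows that contain the itemset.
    result: Dict[FrozenSet[str], int] = {}
    for itemset in seen:
        support = sum(1 for row in data if itemset <= row)
        if support >= min_support:
            result[itemset] = support
    return result


if __name__ == "__main__":
    example_rows = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"}]
    for items, support in sorted(
        mine_closed_itemsets(example_rows, min_support=2).items(),
        key=lambda kv: (-kv[1], sorted(kv[0])),
    ):
        print(sorted(items), support)
```

Because the sketch iterates over rows rather than over item combinations, its work grows with the number of rows, which is the usual motivation for row enumeration when rows are few and columns are many. It performs no pruning, so it is suitable only for small examples.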