D-SKY: a framework for processing skyline queries in a dynamic and incomplete database

Bibliographic Details
Main Authors: Gulzar, Yonis; Alwan, Ali Amer; Ibrahim, Hamidah; Xin, Qin
Format: Conference or Workshop Item
Language: English
Published: ACM 2018
Subjects:
Online Access: http://irep.iium.edu.my/69867/
http://irep.iium.edu.my/69867/13/69867%20D-SKY_A%20Framework%20for%20Processing%20Skyline%20Queries.pdf
http://irep.iium.edu.my/69867/7/69867_D%20sky_SCOPUS.pdf
Description
Summary: Processing skyline queries over incomplete data is challenging, particularly in a dynamic database whose contents are frequently updated. Update operations affect not only the skyline computation but also the skyline results. Furthermore, incomplete data causes the dominance relation used by skyline techniques to lose its transitivity property, which leads to the problem of cyclic dominance. Processing skyline queries on a dynamic and incomplete database by directly re-applying the skyline process over the entire updated database is undesirable because of its prohibitive cost. This paper therefore proposes a framework called D-SKY for processing skyline queries in a dynamic and incomplete database. D-SKY aims to avoid scanning the whole database to identify the new skylines after an update operation is performed. In this paper, we consider the case of an insert operation, in which the database is updated by adding new data items. The D-SKY framework exploits the existing skylines to identify the dominated data items among the newly added ones before applying the skyline process. As a result, a large number of dominated data items are pruned, which reduces the number of domination tests to be conducted and helps avoid scanning the whole database after an update. Experimental results on both real and synthetic datasets demonstrate that our solution outperforms existing solutions in terms of the number of pairwise comparisons and processing time.
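
The two ideas in the summary, cyclic dominance under incomplete data and pruning newly inserted items against the existing skylines, can be illustrated with a minimal Python sketch. It is not the D-SKY algorithm itself. It assumes a commonly used dominance definition for incomplete data (compare only the dimensions where both items have values), assumes that smaller values are preferred, and represents missing values as None. The names dominates and prune_new_items are hypothetical and chosen only for this illustration.

```python
from typing import List, Optional, Sequence

# An item is a tuple of numeric values; None marks a missing value (assumption).
Item = Sequence[Optional[float]]


def dominates(p: Item, q: Item) -> bool:
    """Return True if p dominates q on the dimensions both items share.

    Assumed definition: p is no worse than q on every dimension where both
    have a value, and strictly better on at least one. Smaller is better.
    Items with no dimension in common are treated as incomparable.
    """
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        return False
    return all(a <= b for a, b in common) and any(a < b for a, b in common)


def prune_new_items(skyline: List[Item], new_items: List[Item]) -> List[Item]:
    """Keep only the new items that no existing skyline item dominates.

    This covers just the pruning step described in the summary; the
    surviving items would still need a full skyline process afterwards.
    """
    return [n for n in new_items if not any(dominates(s, n) for s in skyline)]


# Cyclic dominance: a dominates b, b dominates c, and c dominates a,
# because each pair is compared on a different shared dimension.
a = (3, 2, None)
b = (None, 3, 1)
c = (2, None, 2)

# Pruning on insert: only the first new item is dominated by the skyline.
existing_skyline = [(1, 1, None)]
inserted = [(2, 2, 5), (0, None, 3), (None, None, 4)]
survivors = prune_new_items(existing_skyline, inserted)
```

In this sketch the cycle among a, b and c shows why dominance cannot be assumed transitive once values are missing, and the call to prune_new_items shows how a small set of existing skylines can discard dominated inserts with few pairwise comparisons.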