Optimizing skyline query processing in incomplete data

Bibliographic Details
Main Authors: Gulzar, Yonis; Alwan, Ali Amer; Turaev, Sherzod
Format: Article
Language: English
Published: IEEE 2019
Online Access: http://irep.iium.edu.my/77291/
http://irep.iium.edu.my/77291/1/Optimizing%20Skyline%20Query%20Processing_Published_Version_Final.pdf
http://irep.iium.edu.my/77291/7/77291_Optimizing%20Skyline%20Query%20Processing%20in%20Incomplete%20Data_Scopus.pdf
Description
Summary: Given their significance, skyline queries are incorporated in various modern applications, including personalized recommendation systems and decision-making and decision-support systems. Skyline queries identify the superior data items in a database, that is, the items that are not dominated by any other item. Most previously proposed skyline algorithms assume a complete database in which every value is present (non-missing). However, in many contemporary real-world databases, particularly those with large cardinality and high dimensionality, this assumption does not necessarily hold. Missing data therefore pose new challenges, because skyline query processing cannot readily apply methods designed for complete data: incomplete data cause the dominance relation to lose its transitivity property and can lead to cyclic dominance. This paper presents a framework called Optimized Incomplete Skyline (OIS), which uses a technique that simplifies the skyline process on a database with missing data and helps prune data items before the skyline process is performed. The proposed strategy significantly reduces the number of domination tests. A set of experiments was conducted on both real and synthetic datasets to validate the performance of the framework. The experimental results confirm that the OIS framework consistently outperforms current approaches in terms of the number of domination tests required to retrieve the skylines.
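
The loss of transitivity and the cyclic dominance mentioned in the summary can be illustrated with a small sketch. This is not the OIS framework, whose details are not given in this record. It assumes a commonly used dominance definition for incomplete data (items are compared only on the dimensions where both have values) and assumes that smaller values are preferred. The names `dominates` and `naive_skyline` and the three sample items are hypothetical.

```python
from typing import List, Optional, Sequence

# An item is a tuple of values; None marks a missing value (assumption).
Item = Sequence[Optional[float]]


def dominates(p: Item, q: Item) -> bool:
    """Return True if p dominates q on the dimensions both items have.

    Assumed definition: p is no worse than q on every common dimension
    and strictly better on at least one (smaller is better).
    """
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        return False  # no shared dimension, so the items are incomparable
    return all(a <= b for a, b in common) and any(a < b for a, b in common)


def naive_skyline(items: List[Item]) -> List[Item]:
    """Pairwise skyline: keep every item that no other item dominates."""
    return [
        p
        for i, p in enumerate(items)
        if not any(dominates(q, p) for j, q in enumerate(items) if j != i)
    ]


# Hypothetical items with missing values that form a dominance cycle.
a = (1, 2, None)
b = (None, 3, 1)
c = (0, None, 2)

print(dominates(a, b))  # a beats b on the second dimension
print(dominates(b, c))  # b beats c on the third dimension
print(dominates(a, c))  # transitivity would require True here
print(dominates(c, a))  # c beats a on the first dimension, closing the cycle
print(naive_skyline([a, b, c]))
```

In this sample, `a` dominates `b` and `b` dominates `c`, yet `a` does not dominate `c`; instead `c` dominates `a`. Every item is dominated by some other item, so the pairwise skyline is empty, which is why pruning rules that rely on transitivity in complete data cannot be reused unchanged.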