Skyline Preference Query Based on Massive and Incomplete Dataset

Bibliographic Details
Published in	IEEE Access, Vol. 5, pp. 3183-3192
Main Authors	Wang, Yan; Shi, Zhan; Wang, Junlu; Sun, Lingfeng; Song, Baoyan
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2017
Subjects	Algorithms; Clustering; Clustering algorithms; Customization; Cyber-physical systems; Data mining; Data processing; Datasets; Decision analysis; Efficiency; Encoding; Entropy (Information theory); incomplete data processing; Information entropy; Internet of Things; Multiple objective analysis; Queries; Query processing; Response time (computers); Scientific visualization; skyline query; Strategy
Summary:	Personalized recommendation and real-time data processing are typical examples of massive data processing in the Internet of Things (IoT) and have received considerable attention in recent literature. Incomplete data is widespread in massive IoT data sets, and efficient, accurate methods for obtaining personalized information from incomplete data sets are still lacking. The skyline query is a widely used data processing method, especially in multi-objective decision analysis and data visualization. To eliminate the negative effects of incomplete data on massive data processing in the IoT, this paper proposes a novel skyline preference query strategy for massive and incomplete data sets. The strategy divides a massive and incomplete data set into two parts according to dimension importance and executes a skyline query on each part separately. It addresses the problem of extracting personalized information from massive and incomplete data sets and improves the efficiency of skyline queries on such data. First, a skyline preference query strategy based on strict clustering is applied to the dimensions of higher importance. Second, a skyline preference query strategy based on loose clustering is applied to the dimensions of lower importance. Finally, the local skyline query results are integrated, and the global skyline query results are calculated using information entropy theory. The efficiency and effectiveness of the Skyline Preference Query (SPQ) algorithm are evaluated in terms of response time and result set size through comparative experiments with the ISkyline algorithm and the sort-based incomplete data skyline algorithm. Extensive simulation results show that the SPQ algorithm is more efficient than other common methods.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2016.2639558
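
Note: the summary refers to skyline queries over incomplete data without defining them. The sketch below illustrates one commonly used definition, in which two records are compared only on the dimensions that both of them observe. It is not the SPQ algorithm from the paper: the clustering steps and the entropy-based merge are not reproduced here. The names `dominates` and `naive_skyline`, the use of `None` for a missing value, and the "smaller is better" convention are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

# A record is a sequence of values; None marks a missing value (assumption).
Point = Sequence[Optional[float]]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q on the dimensions both records observe.

    Smaller values are treated as better (an assumption of this sketch).
    Records that share no observed dimension are incomparable.
    """
    shared = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not shared:
        return False
    no_worse = all(a <= b for a, b in shared)
    strictly_better = any(a < b for a, b in shared)
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Return every record that no other record dominates.

    This is a quadratic all-pairs check. With missing values, dominance is
    in general not transitive, so each pair is compared explicitly instead
    of discarding dominated records early.
    """
    return [
        p
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


if __name__ == "__main__":
    data = [(1, None, 3), (2, 2, None), (None, 1, 4), (3, 3, 5)]
    print(naive_skyline(data))
```

Under this definition dominance can be cyclic. For example, with `(1, None, 5)`, `(2, 3, None)` and `(None, 4, 1)`, each record is dominated by one of the others, so the naive skyline is empty. Behavior of this kind is one reason incomplete data needs dedicated skyline strategies.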