A relevant subspace based contextual outlier mining algorithm

Bibliographic Details
Published in Knowledge-based systems Vol. 99; pp. 1-9
Main Authors Zhang, Jifu, Yu, Xiaolong, Li, Yonghong, Zhang, Sulan, Xun, Yaling, Qin, Xiao
Format Journal Article
Language English
Published Elsevier B.V 01.05.2016
Summary: A relevant-subspace-based contextual outlier detection algorithm is proposed for high-dimensional, massive data sets. First, the relevant subspace, which can effectively describe the local distribution of various data sets, is redefined using the local sparseness of attribute dimensions. Second, a local outlier factor in the relevant subspace is defined in terms of the probability density of the local data set; it reflects the degree to which a data object fails to follow the distribution of the local data set in the relevant subspace. Third, the attribute dimensions that constitute the relevant subspace, together with the local outlier factor, are defined as the contextual information, which improves the interpretability and comprehensibility of the outliers. Fourth, the N data objects with the greatest local outlier factor values are defined as contextual outliers. Finally, experimental results on UCI data sets validate the effectiveness of the algorithm.
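The four steps in the summary can be illustrated with a minimal sketch. It is not the authors' algorithm: the record does not give their formulas, so the sketch substitutes simple stand-ins. A dimension is assumed "relevant" when its variance among a point's k nearest neighbours is small relative to its global variance, and the outlier factor is assumed to be the mean squared standardized deviation in those dimensions (a Gaussian density proxy). The function name `contextual_outliers` and the parameters `k`, `top_n`, and `sparse_ratio` are hypothetical.

```python
import numpy as np


def contextual_outliers(X, k=10, top_n=5, sparse_ratio=0.5):
    """Illustrative sketch only; the scoring rules below are assumptions,
    not the formulas from the paper.

    Returns a list of (index, score, relevant_dims) for the top_n points.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    eps = 1e-12  # guards against division by zero
    global_var = X.var(axis=0) + eps
    results = []
    for i in range(n):
        # Local data set: the k nearest neighbours, excluding the point itself.
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf
        local = X[np.argsort(dist)[:k]]
        mu = local.mean(axis=0)
        var = local.var(axis=0) + eps
        # Assumed relevance rule: the neighbourhood is much tighter
        # than the whole data set along this dimension.
        relevant = np.where(var / global_var < sparse_ratio)[0]
        if relevant.size == 0:
            results.append((i, 0.0, []))
            continue
        # Assumed outlier factor: mean squared z-score in the relevant
        # subspace (large when the point is improbable under a local Gaussian).
        z2 = (X[i, relevant] - mu[relevant]) ** 2 / var[relevant]
        results.append((i, float(z2.mean()), relevant.tolist()))
    # Contextual outliers: the top_n scores, each reported with its
    # relevant dimensions as context.
    results.sort(key=lambda r: r[1], reverse=True)
    return results[:top_n]
```

Each returned tuple pairs a point's index with its score and the dimensions that produced it, which mirrors the summary's idea of reporting the subspace alongside the outlier factor so that a flagged object can be interpreted.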
ISSN: 0950-7051
1872-7409
DOI: 10.1016/j.knosys.2016.01.013