Unsupervised binary feature construction method for networked data


Bibliographic Details
Published in Expert systems with applications Vol. 121; pp. 256-265
Main Authors Kakisim, Arzu Gorgulu; Sogukpinar, Ibrahim
Format Journal Article
Language English
Published New York Elsevier Ltd 01.05.2019
Elsevier BV
Summary:
•A novel feature construction algorithm is proposed for networked data.
•Attributes are reconstructed by exploiting structural data of network objects.
•An iterative local attribute selection method is applied for each object.
•Our method simulates the attribute space of objects in the same group.
•Our method can be used as a pre-processing step by other methods.

Networked data consists of network objects and the links between them. Network objects are characterized by high-dimensional attributes and by links indicating the relationships among these objects. However, traditional feature selection and feature extraction methods consider only attribute information and ignore link information. In this work, we propose a new unsupervised binary feature construction method (NetBFC) for networked data that reconstructs the attributes of each object by exploiting link information. By finding similar objects in the network and associating them, our method increases the similarity between objects that have a high probability of belonging to the same group. To deal with the sparsity of networked data, the proposed method performs local attribute enrichment and local attribute selection for each object by aggregating the attributes of similar objects. In addition, the method applies an attribute elimination phase to remove irrelevant and redundant attributes, which decrease the performance of clustering algorithms. Experimental results on real-world data sets indicate that NetBFC achieves significantly better performance than baseline methods.
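
The summary does not spell out the algorithm, so the following is only a loose illustration of the general idea it describes (enriching each object's binary attributes from linked objects, keeping locally frequent attributes, then eliminating rarely supported ones). It is not NetBFC itself: the function names, the `keep_ratio` and `min_support` parameters, the use of direct neighbors as "similar objects", and the toy data are all assumptions made for this sketch.

```python
from collections import Counter

def enrich_binary_attributes(attrs, links, keep_ratio=0.5):
    """Rebuild each object's attribute set from itself and its linked objects.

    attrs: dict mapping object id -> set of attribute ids (binary attributes).
    links: iterable of undirected (u, v) pairs.
    keep_ratio: hypothetical threshold; an attribute is kept for an object
    when at least this fraction of its local group carries it.
    """
    neighbors = {node: set() for node in attrs}
    for u, v in links:
        neighbors[u].add(v)
        neighbors[v].add(u)

    rebuilt = {}
    for node in attrs:
        # Local group: the object plus its direct neighbors (an assumption;
        # the paper explores "similar objects", which may be defined differently).
        group = [node] + sorted(neighbors[node])
        counts = Counter(a for member in group for a in attrs[member])
        threshold = keep_ratio * len(group)
        rebuilt[node] = {a for a, c in counts.items() if c >= threshold}
    return rebuilt

def drop_rare_attributes(attrs, min_support=2):
    """Eliminate attributes carried by fewer than min_support objects."""
    support = Counter(a for own in attrs.values() for a in own)
    return {node: {a for a in own if support[a] >= min_support}
            for node, own in attrs.items()}

# Toy data: three linked objects with partially overlapping attributes,
# and one isolated object with a unique attribute.
attrs = {"a": {1, 2}, "b": {1, 3}, "c": {2, 3}, "d": {9}}
links = [("a", "b"), ("b", "c"), ("a", "c")]

rebuilt = enrich_binary_attributes(attrs, links)
pruned = drop_rare_attributes(rebuilt)
```

In this toy case the three linked objects end up sharing the same enriched attribute set, which raises their pairwise similarity, while the isolated object's unique attribute is removed in the elimination step.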
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2018.12.030