Interesting pattern mining in multi-relational data

Bibliographic Details
Published in	Data mining and knowledge discovery, Vol. 28, no. 3, pp. 808–849
Main Authors	Spyropoulou, Eirini; De Bie, Tijl; Boley, Mario
Format	Journal Article
Language	English
Published	Boston: Springer US, 01.05.2014; Springer Nature B.V.
Subjects	Algorithms; Artificial Intelligence; Chemistry and Earth Sciences; Communities; Computer Science; Data mining; Data Mining and Knowledge Discovery; Genre; Information Storage and Retrieval; Logic programming; Pattern analysis; Physics; Relational data bases; Representations; Statistics for Engineering; Syntax; Tables (data); Maximum entropy modelling; Pattern mining; Multi-relational data mining; Interestingness measures; K-partite graphs

More Information
Summary:	Mining patterns from multi-relational data is a problem attracting increasing interest within the data mining community. Traditional data mining approaches are typically developed for single-table databases and are not directly applicable to multi-relational data. Nevertheless, multi-relational data is a more truthful, and therefore often also a more powerful, representation of reality. Mining patterns of a suitably expressive syntax directly from this representation is thus a research problem of great importance. In this paper we introduce a novel approach to mining patterns in multi-relational data. We propose a new syntax for multi-relational patterns as complete connected subsets of database entities. We show how this pattern syntax is generally applicable to multi-relational data, while it reduces to well-known tiles (Geerts et al., Proceedings of Discovery Science, pp 278–289, 2004) when the data is a simple binary or attribute-value table. We propose RMiner, a simple yet practically efficient divide-and-conquer algorithm to mine such patterns, which is an instantiation of an algorithmic framework for efficiently enumerating all fixed points of a suitable closure operator (Boley et al., Theor Comput Sci 411(3):691–700, 2010). We show how the interestingness of patterns of the proposed syntax can conveniently be quantified using a general framework for quantifying subjective interestingness of patterns (De Bie, Data Min Knowl Discov 23(3):407–446, 2011b). Finally, we illustrate the usefulness and the general applicability of our approach by discussing results on real-world and synthetic databases.
ISSN:	1384-5810; 1573-756X
DOI:	10.1007/s10618-013-0319-9
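
Illustration:	The summary describes patterns as complete connected subsets of database entities, which reduce to tiles on a single binary table. The Python sketch below is one possible reading of that phrase, not the authors' RMiner algorithm and not necessarily their exact definition. It assumes the data is given as a graph over typed entities, that "complete" means every pair of entities whose types are related is actually linked, and that "connected" means the chosen entities form one connected component. The function name, the argument names, and the toy table are all hypothetical.

```python
from itertools import combinations


def is_complete_connected(entities, entity_type, related_types, edges):
    """Check whether `entities` is complete and connected (assumed reading).

    entities      -- iterable of entity ids
    entity_type   -- dict mapping entity id -> type name
    related_types -- set of frozenset({type_a, type_b}) that have a relationship
    edges         -- set of frozenset({entity_a, entity_b}) that are linked
    """
    entities = set(entities)
    if not entities:
        return False

    # Completeness: every pair whose types are related must be linked.
    for e, f in combinations(entities, 2):
        pair_types = frozenset({entity_type[e], entity_type[f]})
        if pair_types in related_types and frozenset({e, f}) not in edges:
            return False

    # Connectedness: depth-first search restricted to the chosen entities.
    start = next(iter(entities))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for other in entities - seen:
            if frozenset({current, other}) in edges:
                seen.add(other)
                stack.append(other)
    return seen == entities


# Toy binary table (hypothetical): rows r1, r2 and columns c1, c2, c3.
# A link (row, col) stands for a 1 in the table.
entity_type = {"r1": "row", "r2": "row", "c1": "col", "c2": "col", "c3": "col"}
related_types = {frozenset({"row", "col"})}
edges = {
    frozenset(pair)
    for pair in [("r1", "c1"), ("r1", "c2"), ("r2", "c1"), ("r2", "c2"), ("r2", "c3")]
}

# Rows {r1, r2} with columns {c1, c2} form an all-ones block, i.e. a tile.
tile = {"r1", "r2", "c1", "c2"}
# Adding c3 breaks completeness because (r1, c3) is a 0 in the table.
not_a_tile = {"r1", "r2", "c1", "c3"}
```

On this single-table example the check accepts exactly the all-ones row/column blocks, which matches the summary's remark that the syntax reduces to tiles in the binary case. Enumerating maximal such sets efficiently is the part the paper addresses and is not attempted here.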