Discriminative frequent subgraph mining with optimality guarantees

Bibliographic Details
Published in Statistical analysis and data mining Vol. 3; no. 5; pp. 302 - 318
Main Authors Thoma, Marisa; Cheng, Hong; Gretton, Arthur; Han, Jiawei; Kriegel, Hans-Peter; Smola, Alex; Song, Le; Yu, Philip S.; Yan, Xifeng; Borgwardt, Karsten M.
Format Journal Article
Language English
Published Hoboken Wiley Subscription Services, Inc., A Wiley Company 01.10.2010
Summary: The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near‐optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state‐of‐the‐art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining. Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 302‐318, 2010
Bibliography:ArticleID:SAM10084
istex:F94AD66C390786E3609CD76AD7D0215EC346ABFD
ark:/67375/WNG-DDPHWP61-V
ISSN: 1932-1864, 1932-1872
DOI:10.1002/sam.10084
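
Illustration: the Summary states that a submodular quality criterion allows a near-optimal feature set to be found by greedy selection. The sketch below shows what such a greedy loop can look like. It is not taken from the article: the function names (`count_unresolved_pairs`, `greedy_select`), the binary class labels, and the stand-in criterion (the number of cross-class graph pairs that the selected subgraph features fail to tell apart) are all assumptions made for illustration, and the criterion may differ from the exact CORK definition.

```python
from itertools import product


def count_unresolved_pairs(features, selected, labels):
    """Stand-in criterion (an assumption, not the article's definition):
    count pairs of graphs from different classes whose 0/1 indicator
    values agree on every selected feature. Lower is better."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    return sum(
        all(features[f][i] == features[f][j] for f in selected)
        for i, j in product(pos, neg)
    )


def greedy_select(features, labels, k):
    """Greedily add the feature that leaves the fewest unresolved pairs,
    stopping after k features or when no feature improves the criterion."""
    selected = []
    remaining = set(features)
    current = count_unresolved_pairs(features, selected, labels)
    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = min(
            sorted(remaining),
            key=lambda f: count_unresolved_pairs(features, selected + [f], labels),
        )
        best_score = count_unresolved_pairs(features, selected + [best], labels)
        if best_score >= current:
            break  # no remaining feature separates any further pair
        selected.append(best)
        remaining.remove(best)
        current = best_score
    return selected


# Hypothetical toy data: each feature maps to one 0/1 entry per graph,
# indicating whether that subgraph occurs in the graph.
labels = [1, 1, 0, 0]
features = {
    "a": [1, 1, 0, 0],
    "b": [1, 0, 1, 0],
    "c": [1, 1, 1, 0],
}
chosen = greedy_select(features, labels, k=2)
```

In this toy setting feature "a" alone separates both classes, so the loop stops after one pick. The Summary's second point, pruning inside gSpan, would require an upper bound on the criterion during pattern growth and is not covered by this sketch.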