Similarity Search over Personal Process Description Graph

People are involved in various processes in their daily lives, such as cooking a dish, applying for a job or opening a bank account. With the advent of easy-to-use Web-based sharing platforms, many of these processes are shared as step-by-step instructions (e.g., “how-to guides” in eHow and wikiHow)...

Full description

Saved in:
Bibliographic Details
Published inWeb Information Systems Engineering - WISE 2015 Vol. 9418; pp. 522 - 538
Main Authors Hsu, Jing Ouyang, Paik, Hye-young, Zhan, Liming
Format Book Chapter
LanguageEnglish
Published Switzerland Springer International Publishing AG 2015
Springer International Publishing
SeriesLecture Notes in Computer Science
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:People are involved in various processes in their daily lives, such as cooking a dish, applying for a job or opening a bank account. With the advent of easy-to-use Web-based sharing platforms, many of these processes are shared as step-by-step instructions (e.g., “how-to guides” in eHow and wikiHow) on-line in natural language form. We refer to them as personal process descriptions. In our early work, we proposed a graph-based model named Personal Process Description Graph (PPDG) to concretely represent and query the personal process descriptions. However, in practice, it is difficult to find identical personal processes or fragments for a given query due to the free-text nature of personal process descriptions. Therefore, in this paper, we propose an idea of similarity search over the “how-to guides” based on PPDG. We introduce the concept of “similar personal processes” which defines the similarity between two PPDGs by utilizing the features of both PPDG nodes and structure. Efficient and effective algorithms to process similarity search over PPDGs are developed with novel pruning techniques following a filtering-refinement framework. We present a comprehensive experimental study over both real and synthetic datasets to demonstrate the efficiency and scalability of our techniques.
ISBN:9783319261898
3319261894
ISSN:0302-9743
1611-3349
DOI:10.1007/978-3-319-26190-4_35