Finding Maximal Similar Paths Between XML Documents Using Sequential Patterns
Published in | Advances in Information Systems pp. 96 - 106 |
---|---|
Main Authors | , |
Format | Book Chapter Conference Proceeding |
Language | English |
Published | Berlin, Heidelberg: Springer Berlin Heidelberg (Springer), 01.01.2004 |
Series | Lecture Notes in Computer Science |
ISBN | 9783540234784 3540234780 |
ISSN | 0302-9743 1611-3349 |
DOI | 10.1007/978-3-540-30198-1_11 |
Summary: | Techniques for storing XML documents, optimizing queries, and indexing XML have been active subjects of research. Most of these techniques focus on XML documents that share the same structure (i.e., the same DTD or XML Schema). However, when XML documents from the Web or an EDMS (Electronic Document Management System) must be merged or classified, it is important to find the common structure among multiple documents before processing them. In this paper, we propose a new methodology for extracting common structures from XML documents and finding maximal similar paths between structures using sequential pattern mining algorithms. Correct determination of common structures between XML documents provides an important basis for a variety of applications of XML document mining and processing. Experiments with XML documents show that our adapted sequential pattern mining algorithms find common structures and the maximal similar paths between them exactly. |