A Machine Learning Approach to Identifying Database Sessions Using Unlabeled Data
Published in | Data Warehousing and Knowledge Discovery, pp. 254-264 |
---|---|
Main Authors | , , |
Format | Book Chapter Conference Proceeding |
Language | English |
Published | Berlin, Heidelberg: Springer Berlin Heidelberg (Springer), 2005 |
Series | Lecture Notes in Computer Science |
Subjects | |
Summary: | In this paper, we describe a novel co-training based algorithm for identifying database user sessions from database traces. The algorithm learns to identify positive data (session boundaries) and negative data (non-session boundaries) incrementally by using two methods interactively over several iterations. In each iteration, previously identified positive and negative data are used to build better models, which in turn can label some new data and improve the performance of later iterations. We also present experimental results. |
---|---|
ISBN: | 354028558X 9783540285588 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/11546849_25 |
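
The iterative labeling loop described in the summary can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not say which two methods the paper uses, so the two "views" here (a time gap between consecutive queries and a query dissimilarity score), the threshold classifier, and all function names are assumptions made purely for illustration.

```python
from statistics import mean

def fit_threshold(values, labels):
    """Hypothetical 1-D model: threshold at the midpoint of the two class means."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    pos_mean, neg_mean = mean(pos), mean(neg)
    sign = 1 if pos_mean >= neg_mean else -1
    return (pos_mean + neg_mean) / 2, sign

def score(model, value):
    """Signed distance from the threshold; positive means 'session boundary'."""
    threshold, sign = model
    return sign * (value - threshold)

def co_train(view_a, view_b, seed_labels, iterations=3, per_round=1):
    """Grow a shared pool of labels by alternating between two views.

    view_a, view_b: one feature value per candidate boundary (assumed features).
    seed_labels: dict mapping index -> bool; must contain both classes.
    """
    labels = dict(seed_labels)
    for _ in range(iterations):
        for view in (view_a, view_b):
            unlabeled = [i for i in range(len(view)) if i not in labels]
            if not unlabeled:
                return labels
            known = sorted(labels)
            # Rebuild this view's model from everything labeled so far.
            model = fit_threshold([view[i] for i in known],
                                  [labels[i] for i in known])
            # Label only the examples this view is most confident about.
            ranked = sorted(unlabeled,
                            key=lambda i: abs(score(model, view[i])),
                            reverse=True)
            for i in ranked[:per_round]:
                labels[i] = score(model, view[i]) > 0
    return labels

# Made-up data: seconds between consecutive queries, and a dissimilarity score.
gaps = [2, 3, 600, 1, 900, 4, 5, 700]
dissimilarity = [0.1, 0.2, 0.9, 0.1, 0.8, 0.2, 0.3, 0.95]
seeds = {0: False, 2: True}  # one known non-boundary, one known boundary

result = co_train(gaps, dissimilarity, seeds)
print(sorted(i for i, is_boundary in result.items() if is_boundary))
```

Each pass retrains a view on all labels gathered so far, so labels contributed by one view feed the other view's model in the next step, which is the general idea behind co-training.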