Content Code Blurring: A New Approach to Content Extraction
Most HTML documents on the World Wide Web contain far more than the article or text that forms their main content. Navigation menus, functional and design elements, and commercial banners are typical examples of this additional content. Content extraction is the process of identifying the main content an...
Published in | 2008 19th International Workshop on Database and Expert Systems Applications pp. 29 - 33 |
---|---|
Main Author | |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.09.2008 |
Subjects | |