A Biased Support Vector Machine Approach to Web Filtering

Bibliographic Details
Published in	Lecture Notes in Computer Science, pp. 363-370
Main Authors	Du, A-Ning; Fang, Bin-Xing; Li, Bin
Format	Book Chapter; Conference Proceeding
Language	English
Published	Berlin, Heidelberg: Springer Berlin Heidelberg, 2005
Series	Lecture Notes in Computer Science
Subjects	Applied sciences | Computer science; control theory; systems | Data processing. List processing. Character string processing | Exact sciences and technology | Inductive Construction | Machine Learning Algorithm | Memory organisation. Data processing | Software | Statistical Learn Theory | Support Vector Machine | User Interest | Statistical analysis | Filtering | Similarity | Pattern recognition | Data mining | Filter | World wide web | Distribution function | Vector support machine | Internet | Automatic analysis | Learning algorithm | Artificial intelligence | User behavior | Content analysis | Biased estimation

More Information
Summary:	Web filtering is an inductive process that automatically builds a filter by learning a description of user interest from a set of pre-assigned web pages, and then uses the filter to classify unprocessed web pages. Content similarity analysis is the core problem in web filtering; the automatic-learning and relativity-analysis abilities of machine learning (ML) algorithms help solve this problem and make ML useful in web filtering. In practical applications, however, different filtering tasks imply different user interests and thus different filtering results. This work studies how to adjust web filtering results to better fit the user interest. The filtering results are divided into three categories according to the user interest: relative pages, similar pages, and homologous pages. A Biased Support Vector Machine (BSVM) algorithm is introduced to adjust the filtering result to best fit the user interest. It imports a stimulant function and uses the training-example distribution n+/n− together with a user-adaptable parameter k to treat the different classes of pre-assigned pages unequally. Experiments show that BSVM can greatly improve web filtering performance.
ISBN:	3540287574 9783540287575
ISSN:	0302-9743 1611-3349
DOI:	10.1007/11551188_39
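
Illustrative note: the summary describes BSVM only at a high level, so the sketch below is not the authors' algorithm. It shows one common way to treat two classes unequally in a linear SVM: scale the hinge-loss penalty of the positive class by a factor built from the class counts and a user-chosen k. The weighting rule `k * n_neg / n_pos`, the function names, the toy data, and all hyperparameters are assumptions made for illustration. The paper's stimulant function is not modeled, because the summary does not define it.

```python
import random


def class_costs(y, k=1.0):
    """Return (positive_cost, negative_cost) for labels in {+1, -1}.

    Assumption, not taken from the paper: the positive class is
    penalized by k * n_neg / n_pos and the negative class by 1.
    """
    n_pos = sum(1 for label in y if label == 1)
    n_neg = len(y) - n_pos
    return k * n_neg / n_pos, 1.0


def train_weighted_linear_svm(X, y, k=1.0, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Subgradient descent on an L2-regularized, class-weighted hinge loss."""
    c_pos, c_neg = class_costs(y, k)
    w = [0.0] * len(X[0])
    b = 0.0
    rng = random.Random(seed)
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi, yi = X[i], y[i]
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            cost = c_pos if yi == 1 else c_neg
            # Shrink the weights (gradient of the L2 regularizer).
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge loss is active: step toward the example,
                # scaled by the cost of its class.
                w = [wj + lr * cost * yi * xj for wj, xj in zip(w, xi)]
                b += lr * cost * yi
    return w, b


def predict(w, b, x):
    """Classify x as +1 or -1 from the sign of the decision value."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, imbalanced data: 2 "interesting" pages (+1), 6 others (-1).
X = [(3, 3), (4, 3), (-3, -3), (-4, -2), (-2, -4), (-3, -2), (-2, -3), (-4, -4)]
y = [1, 1, -1, -1, -1, -1, -1, -1]
w, b = train_weighted_linear_svm(X, y, k=1.0)
predictions = [predict(w, b, x) for x in X]
```

In this sketch, raising k makes errors on the minority (positive) class more expensive, which is one way a user-adaptable parameter can bias the filter toward or away from accepting pages.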