Bayesian statistical analysis for spams

Bibliographic Details
Published in	IEEE Local Computer Network Conference pp. 989 - 992
Main Authors	Begriche, Y; Serhrouchni, A
Format	Conference Proceeding
Language	English
Published	IEEE 01.10.2010
Subjects	Bayesian methods; Bayesian statistical model; Binomial law; Classification; Conditional density; Distribution attachment; Filtering; Ham(H); Niobium; Spam(S); Telecommunications; Training; Unsolicited electronic mail
Summary:	This paper presents a Bayesian statistical analysis applied to the spam problem. Most anti-spam research assumes that the prior probability of spam is 0.5, which is in our opinion unrealistic. It also assumes that the words in a spam message are mutually independent. We therefore examine how the posterior probability behaves when the prior probability differs from 0.5, and derive the consequences of the word-independence assumption for the posterior probability. The first assumption leads us to define prior and posterior probability laws that enhance spam detection and increase the reliability of the decision. This analysis differs from previous applications of the Bayesian approach to the anti-spam problem, especially through its refinement and enhancement of various probability laws.
ISBN:	1424483875 9781424483877
ISSN:	0742-1303
DOI:	10.1109/LCN.2010.5735846
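
Illustration: the summary turns on how the posterior probability of spam changes when the prior is not 0.5. The sketch below shows that effect with the textbook naive Bayes rule under the word-independence assumption. It is not the authors' refined model, which this record does not describe. The function name `spam_posterior` and the per-word likelihood values are hypothetical, chosen only for the example.

```python
from math import exp, log


def spam_posterior(prior_spam, word_likelihoods):
    """Return P(S | words) under the naive (independent-words) assumption.

    prior_spam       -- assumed prior P(S), strictly between 0 and 1
    word_likelihoods -- list of (P(w | S), P(w | H)) pairs, one per observed
                        word; these are illustrative inputs, not estimates
                        taken from the paper
    """
    # Work in log-odds: log P(S)/P(H) plus one log-likelihood ratio per word.
    log_odds = log(prior_spam) - log(1.0 - prior_spam)
    for p_word_spam, p_word_ham in word_likelihoods:
        log_odds += log(p_word_spam) - log(p_word_ham)
    # Convert the log-odds back to a probability.
    return 1.0 / (1.0 + exp(-log_odds))


# Hypothetical likelihoods for two observed words.
likelihoods = [(0.8, 0.2), (0.6, 0.3)]

for prior in (0.5, 0.2):
    print(prior, round(spam_posterior(prior, likelihoods), 4))
```

With these made-up likelihoods the same message scores about 0.89 under a prior of 0.5 but about 0.67 under a prior of 0.2, which is why the choice of prior matters for the decision threshold.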