Bayesian statistical analysis for spams

Bibliographic Details
Published in IEEE Local Computer Network Conference, pp. 989-992
Main Authors Begriche, Y.; Serhrouchni, A.
Format Conference Proceeding
Language English
Published IEEE 01.10.2010
Summary: This paper presents a Bayesian statistical analysis applied to the spam problem. Most anti-spam research assumes that the probability of a spam occurrence is equal to 0.5, which is, in our opinion, unrealistic. It is also generally assumed that the words in a spam message form an independent family. This leads us to examine how the posterior probability behaves when the a priori probability differs from 0.5, and to derive the consequences of the word-independence assumption for the posterior probability. The first assumption prompts us to define a prior and to find posterior probability laws that enhance spam detection and increase the reliability of the decision. This analysis differs from previous work that applied the Bayesian approach to the anti-spam problem, especially through the refinement and enhancement of various probability laws.
ISBN: 1424483875, 9781424483877
ISSN: 0742-1303
DOI: 10.1109/LCN.2010.5735846
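
The summary's central question, how the posterior behaves once the prior probability of spam is no longer fixed at 0.5, can be illustrated with a minimal sketch. This is a generic naive Bayes calculation under the word-independence assumption that the summary mentions, not the probability laws derived in the paper itself. The function name `spam_posterior` and the per-word likelihood values are hypothetical, chosen only for illustration.

```python
import math


def spam_posterior(prior_spam, word_likelihoods):
    """Posterior P(spam | words) under a word-independence assumption.

    prior_spam: assumed a priori probability of spam, strictly in (0, 1).
    word_likelihoods: list of (P(word | spam), P(word | ham)) pairs,
        one per observed word; hypothetical values, each strictly positive.
    """
    # Work in log space so long messages do not underflow to zero.
    log_spam = math.log(prior_spam)
    log_ham = math.log(1.0 - prior_spam)
    for p_word_spam, p_word_ham in word_likelihoods:
        log_spam += math.log(p_word_spam)
        log_ham += math.log(p_word_ham)

    # Normalize the two joint scores into a posterior probability.
    shift = max(log_spam, log_ham)
    spam_score = math.exp(log_spam - shift)
    ham_score = math.exp(log_ham - shift)
    return spam_score / (spam_score + ham_score)


# One hypothetical word that is four times likelier in spam than in ham.
evidence = [(0.8, 0.2)]
posterior_equal_prior = spam_posterior(0.5, evidence)
posterior_low_prior = spam_posterior(0.2, evidence)
```

With this made-up evidence, the posterior is about 0.8 under the conventional prior of 0.5 and falls to about 0.5 when the prior is lowered to 0.2, which shows how strongly the decision can depend on the choice of prior.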