Training filters for detecting spam based on IP addresses and text-related features

Bibliographic Details
Main Authors: Goodman, Joshua T; Rounthwaite, Robert L; Hulten, Geoffrey J; Yih, Wen-tau
Format: Patent
Language: English
Published: 09.12.2008

Summary: The subject invention provides an intelligent quarantining system and method that facilitates detecting and preventing spam. In particular, the invention employs a machine learning filter specifically trained using origination features, such as an IP address, as well as destination features, such as a URL. Moreover, the system and method involve training a plurality of filters using specific feature data for each filter. The filters are trained independently of each other, so that one feature does not unduly influence another in determining whether a message is spam. Because multiple filters are trained and available to scan messages either individually or in combination (at least two filters), the filtering or spam detection process can generalize to new messages having slightly modified features (e.g., a slightly different IP address). The invention also involves locating the appropriate IP addresses or URLs in a message, as well as guiding filters to weigh origination or destination features more heavily than text-based features.
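
The idea in the summary (separate filters trained on separate feature sets, then used in combination) can be illustrated with a minimal sketch. This is not the patent's implementation. The naive Bayes model, the feature names (`ip:`, `ip24:`, `ip16:`, `word:`), the helper functions, the `ip_weight` parameter, and the toy training data are all assumptions chosen for illustration. The IP filter sees only address-derived features, including shorter prefixes, so a message from a slightly different address in the same block may still be recognized.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Tiny naive Bayes filter over string features (illustrative only)."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}
        self.totals = {True: 0, False: 0}

    def train(self, features, is_spam):
        self.totals[is_spam] += 1
        for feature in set(features):
            self.counts[is_spam][feature] += 1

    def log_odds(self, features):
        # Laplace-smoothed log-odds of spam versus non-spam.
        score = math.log((self.totals[True] + 1) / (self.totals[False] + 1))
        for feature in set(features):
            p_spam = (self.counts[True][feature] + 1) / (self.totals[True] + 2)
            p_ham = (self.counts[False][feature] + 1) / (self.totals[False] + 2)
            score += math.log(p_spam / p_ham)
        return score


def ip_features(ip):
    # Hypothetical origination features: full address plus /24 and /16 prefixes.
    parts = ip.split(".")
    return ["ip:" + ip, "ip24:" + ".".join(parts[:3]), "ip16:" + ".".join(parts[:2])]


def text_features(body):
    # Hypothetical text features: lowercased whitespace-separated tokens.
    return ["word:" + word for word in body.lower().split()]


def spam_probability(ip_filter, text_filter, ip, body, ip_weight=0.7):
    # Assumed combination rule: weighted sum of the two filters' log-odds,
    # with the origination filter weighted more heavily than the text filter.
    combined = (ip_weight * ip_filter.log_odds(ip_features(ip))
                + (1 - ip_weight) * text_filter.log_odds(text_features(body)))
    return 1 / (1 + math.exp(-combined))


# Toy training data (made up for this sketch).
training = [
    ("203.0.113.7", "cheap pills buy now", True),
    ("203.0.113.9", "win money now", True),
    ("198.51.100.4", "meeting notes attached", False),
    ("198.51.100.20", "lunch tomorrow at noon", False),
]

ip_filter, text_filter = NaiveBayesFilter(), NaiveBayesFilter()
for ip, body, is_spam in training:
    # Each filter is trained independently on its own feature set.
    ip_filter.train(ip_features(ip), is_spam)
    text_filter.train(text_features(body), is_spam)

# Unseen address in a spam-heavy block, with innocuous-looking text.
p_new = spam_probability(ip_filter, text_filter, "203.0.113.50", "meeting notes attached")
# Unseen address in a block that has only sent legitimate mail.
p_ham = spam_probability(ip_filter, text_filter, "198.51.100.99", "lunch tomorrow")
```

In this toy setup, the first message scores above 0.5 even though its text looks legitimate, because the prefix features carry over from the known spam addresses and the IP filter is weighted more heavily. Training the two filters separately keeps the text tokens from diluting or distorting the address evidence during learning.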