Support vector machines for spam categorization

Citation
H. Drucker et al., Support vector machines for spam categorization, IEEE Transactions on Neural Networks, 10(5), 1999, pp. 1048-1054
Number of citations
18
Subject categories
AI, Robotics and Automatic Control
Journal title
IEEE TRANSACTIONS ON NEURAL NETWORKS
ISSN journal
10459227
Volume
10
Issue
5
Year of publication
1999
Pages
1048 - 1054
Database
ISI
SICI code
1045-9227(199909)10:5<1048:SVMFSC>2.0.ZU;2-R
Abstract
We study the use of support vector machines (SVM's) in classifying e-mail as spam or nonspam by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. These four algorithms were tested on two different data sets: one data set where the number of features was constrained to the 1000 best features and another data set where the dimensionality was over 7000. SVM's performed best when using binary features. For both data sets, boosting trees and SVM's had acceptable test performance in terms of accuracy and speed. However, SVM's had significantly less training time.