
Support vector machines for spam categorization

Authors

Drucker, H; Wu, DH; Vapnik, VN

Citation

H. Drucker et al., Support vector machines for spam categorization, IEEE NEURAL, 10(5), 1999, pp. 1048-1054

Citations number

Subject categories

AI Robotics and Automatic Control

Journal title

IEEE TRANSACTIONS ON NEURAL NETWORKS

ISSN journal

1045-9227

Volume

10

Issue

5

Year of publication

1999

Pages

1048 - 1054

Database

ISI

SICI code

1045-9227(199909)10:5<1048:SVMFSC>2.0.ZU;2-R

Abstract

We study the use of support vector machines (SVM's) in classifying e-mail as spam or nonspam by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. These four algorithms were tested on two different data sets: one data set where the number of features was constrained to the 1000 best features and another data set where the dimensionality was over 7000. SVM's performed best when using binary features. For both data sets, boosting trees and SVM's had acceptable test performance in terms of accuracy and speed. However, SVM's had significantly less training time.
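
Illustrative sketch

The abstract notes that SVM's performed best with binary features, that is, features recording only whether a word occurs in a message, not how often. The following is a minimal sketch of that idea, not the authors' method: the abstract does not state the kernel, feature selection, or training procedure, so this assumes a linear SVM trained by plain subgradient descent on the regularized hinge loss. The messages, function names, and hyperparameters are all invented for illustration.

```python
# Toy illustration only: the messages, labels, and hyperparameters below are
# invented and do not come from the paper.

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()

def build_vocab(messages):
    """Map each distinct training word to a feature index."""
    words = sorted({w for m in messages for w in tokenize(m)})
    return {w: i for i, w in enumerate(words)}

def binary_features(text, vocab):
    """Return a 0/1 vector: 1 if the word occurs at all, regardless of count."""
    x = [0] * len(vocab)
    for w in set(tokenize(text)):
        if w in vocab:
            x[vocab[w]] = 1
    return x

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=50):
    """Subgradient descent on the L2-regularized hinge loss (labels are +1/-1)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularization step: shrink the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step: update only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(text, vocab, w, b):
    """Return +1 (spam) or -1 (nonspam) from the sign of the decision value."""
    x = binary_features(text, vocab)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical training set: +1 = spam, -1 = nonspam.
train_messages = [
    "win free money now",
    "free prize claim now",
    "meeting agenda for monday",
    "project report attached monday",
]
train_labels = [1, 1, -1, -1]

vocab = build_vocab(train_messages)
X = [binary_features(m, vocab) for m in train_messages]
w, b = train_linear_svm(X, train_labels)

for msg in ["claim your free prize", "monday project meeting"]:
    print(msg, "->", "spam" if predict(msg, vocab, w, b) == 1 else "nonspam")
```

Words unseen in training (such as "your" above) are simply ignored. A realistic experiment would use a proper tokenizer, a much larger corpus, and an established SVM solver in place of this hand-written loop.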