This paper investigates the use of sequences of system calls for
classifying intrusions and faults induced by privileged processes in Unix.
Classification is an essential capability for responding to an anomaly
(attack or fault), since it makes it possible to associate an appropriate
response with each
anomaly type. Previous work using the well-known dataset from the
University of New Mexico (UNM) has demonstrated the usefulness of
monitoring sequences of system calls for detecting anomalies induced by
processes corresponding to several Unix Programs, such as sendmail, lpr,
and ftp. Specifically,
previous work has shown that the Anomaly Count of a running process, i.e.,
the number of sequences spawned by the process which are not found in the
corresponding dictionary of normal activity for the Program, is a valuable
feature for anomaly detection. To achieve Classification, in this paper we
introduce the concept of Anomaly Dictionaries, which are the sets of
anomalous sequences for each type of anomaly. It is verified that the
Anomaly Dictionaries for UNM's sendmail Program have very little overlap
and can be used effectively for Anomaly Classification. The sequences in
the Anomaly
Dictionaries enable a description of Self for the Anomalies, analogous to
the definition of Self for Privileged Programs given by the Normal
Dictionaries. The dependence of Classification Accuracy on sequence length
is also discussed. As a side result, it is also shown that a hybrid scheme
combining the proposed classification strategy with the original Anomaly
Counts can
lead to a substantial improvement in the overall detection rates for the
sendmail dataset. The proposed methodology is rather general and can be
applied to any situation where sequences of symbols provide an effective
characterization of a phenomenon.
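
The ideas of Anomaly Count and Anomaly Dictionaries can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the toy traces, the labels `type_a` and `type_b`, and the rule of assigning a trace to the anomaly type whose dictionary shares the most sequences with it are all assumptions made for illustration. Sequences are taken to be contiguous windows of `k` system calls, and the Anomaly Count is taken to count every window occurrence that is missing from the normal dictionary.

```python
def windows(trace, k):
    """Return every contiguous length-k subsequence of a trace as a tuple."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]


def build_normal_dictionary(normal_traces, k):
    """Collect the set of length-k sequences seen in normal activity."""
    return {w for trace in normal_traces for w in windows(trace, k)}


def anomalous_sequences(trace, normal_dict, k):
    """List the windows of a trace that are absent from the normal dictionary."""
    return [w for w in windows(trace, k) if w not in normal_dict]


def anomaly_count(trace, normal_dict, k):
    """Count window occurrences not found in the normal dictionary (assumed definition)."""
    return len(anomalous_sequences(trace, normal_dict, k))


def build_anomaly_dictionaries(labeled_traces, normal_dict, k):
    """Build one set of anomalous sequences per anomaly type."""
    dictionaries = {}
    for label, trace in labeled_traces:
        found = anomalous_sequences(trace, normal_dict, k)
        dictionaries.setdefault(label, set()).update(found)
    return dictionaries


def classify(trace, normal_dict, anomaly_dicts, k):
    """Return the anomaly type with the largest overlap, or None if nothing matches.

    The largest-overlap rule is an assumption of this sketch.
    """
    observed = set(anomalous_sequences(trace, normal_dict, k))
    scores = {label: len(observed & d) for label, d in anomaly_dicts.items()}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Toy data (hypothetical, not taken from the UNM dataset).
K = 2
NORMAL_TRACES = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "close"],
]
LABELED_ANOMALIES = [
    ("type_a", ["open", "exec", "read", "close"]),
    ("type_b", ["open", "read", "fork", "fork", "close"]),
]
NORMAL_DICT = build_normal_dictionary(NORMAL_TRACES, K)
ANOMALY_DICTS = build_anomaly_dictionaries(LABELED_ANOMALIES, NORMAL_DICT, K)

NEW_TRACE = ["open", "read", "fork", "close"]
print(anomaly_count(NEW_TRACE, NORMAL_DICT, K))
print(classify(NEW_TRACE, NORMAL_DICT, ANOMALY_DICTS, K))
```

Because the two toy dictionaries share no sequences, a single matching window is enough to separate the types here; with overlapping dictionaries the scoring rule would matter more, and the choice of `k` would affect how much overlap occurs.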