Fisher's linear discriminant analysis (LDA) is a popular data-analytic
tool for studying the relationship between a set of predictors and a
categorical response. In this paper we describe a penalized version of
LDA. It is designed for situations in which there are many highly
correlated predictors, such as those obtained by discretizing a
function, or the grey-scale values of the pixels in a series of
images. In cases such as these it is natural, efficient and sometimes
essential to impose a spatial smoothness constraint on the
coefficients, both for improved prediction performance and
interpretability. We cast the classification problem into a
regression framework via optimal scoring. Using
this, our proposal facilitates the use of any penalized regression
technique in the classification setting. The technique is illustrated
with examples in speech recognition and handwritten character
recognition.
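
The optimal scoring idea can be sketched in a few lines of NumPy. The
sketch below is illustrative only and is not taken from the paper: the
function names, the second-difference form of the smoothness penalty,
the small ridge term added for numerical stability, and the parameter
lam are all assumptions made for this example. It regresses the class
indicator matrix on the centered predictors with a quadratic penalty,
then finds class scores by an eigen-decomposition of the fitted
values.

```python
import numpy as np

def second_difference_penalty(p, lam, ridge=1e-8):
    """Hypothetical smoothness penalty: lam * D'D, with D the second-difference operator.

    The small ridge term is an assumption added here to keep the system well conditioned.
    """
    D = np.diff(np.eye(p), n=2, axis=0)  # (p - 2) x p, rows of the form [1, -2, 1]
    return lam * D.T @ D + ridge * np.eye(p)

def penalized_optimal_scoring(X, y, omega):
    """Sketch of optimal scoring with a quadratic penalty beta' omega beta.

    Returns class scores theta (K x (K-1)), coefficients beta (p x (K-1)),
    and the corresponding eigenvalues in decreasing order.
    """
    classes, idx = np.unique(y, return_inverse=True)
    n, K = X.shape[0], classes.size

    # Indicator response matrix: one column per class.
    Y = np.zeros((n, K))
    Y[np.arange(n), idx] = 1.0

    # Centering the predictors removes the trivial constant score.
    Xc = X - X.mean(axis=0)

    # Penalized multi-response regression of Y on Xc.
    B = np.linalg.solve(Xc.T @ Xc + omega, Xc.T @ Y)
    Y_hat = Xc @ B

    # Solve M theta = alpha * Dp * theta, where Dp holds the class proportions.
    d = np.sqrt(Y.sum(axis=0) / n)      # square roots of class proportions
    M = Y.T @ Y_hat / n
    Ms = M / np.outer(d, d)             # Dp^{-1/2} M Dp^{-1/2}
    Ms = (Ms + Ms.T) / 2                # symmetrize against round-off
    evals, evecs = np.linalg.eigh(Ms)

    # Keep the K - 1 leading solutions; the remaining one is the constant score.
    order = np.argsort(evals)[::-1][: K - 1]
    theta = evecs[:, order] / d[:, None]  # normalized so theta' Dp theta = I
    beta = B @ theta                      # discriminant coefficient vectors
    return theta, beta, evals[order]
```

Any other penalized regression could in principle replace the
np.linalg.solve step; the quadratic second-difference penalty is used
here only because it is a simple way to encourage coefficients that
vary smoothly across neighbouring predictors.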