Recognition accuracy has been the primary objective of most speech
recognition research, and impressive results have been obtained, e.g.
less than 0.3% word error rate on a speaker-independent digit recognition
task. When it comes to real-world applications, however, robustness and
real-time response might be more important issues. For the first requirement,
we review some of the work on robustness and discuss one specific
technique, spectral normalization, in more detail. The requirement of
real-time response has to be considered in light of the limited hardware
resources in voice control applications, which result from tight cost
constraints. In this paper, we discuss in detail one specific means of
reducing the processing and memory demands: a clustering technique
applied at various levels within the acoustic modelling.