Research in human/computer interaction has mainly focused on natural language, text, speech and vision, primarily in isolation. Recently, a number of research projects have concentrated on the integration of such modalities using intelligent reasoners. The rationale is that many inherent ambiguities in single modes of communication can be resolved if extra information is available. This paper describes an intelligent multi-modal system called the Smart Work Manager. The main characteristics of the Smart Work Manager are that it can process speech, text, face images, gaze information and simulated gestures using the mouse as input modalities, and that its output is in the form of speech, text or graphics. The main components of the system are the reasoner, a speech system, a vision system, an integration platform and the application interface. The overall architecture of the system will be described, together with the integration platform and the components of the system, which include a non-intrusive neural-network-based gaze-tracking system. The paper concludes with a discussion of the applicability of such systems to intelligent human/computer interaction and lessons learnt in terms of reliability and efficiency.
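The rationale above, that a second modality can resolve an ambiguity left open by the first, can be illustrated with a minimal sketch. This is a hypothetical example, not the Smart Work Manager's actual fusion algorithm: the function name, the scoring scheme and the 0.5 agreement bonus are all illustrative assumptions.

```python
# Hypothetical sketch (not from the paper): resolving an ambiguous
# spoken referent ("that one") using a second modality (gaze).
# All names and weights here are illustrative assumptions.

def resolve_referent(speech_hypotheses, gaze_target):
    """Pick the speech hypothesis whose referent matches the gazed object.

    speech_hypotheses: list of (referent, confidence) pairs from a
    speech recogniser; gaze_target: object id from a gaze tracker.
    """
    # Boost hypotheses that agree with where the user is looking.
    scored = [
        (ref, conf + (0.5 if ref == gaze_target else 0.0))
        for ref, conf in speech_hypotheses
    ]
    # Return the highest-scoring referent after fusion.
    return max(scored, key=lambda pair: pair[1])[0]

# "Delete that one" is ambiguous between two on-screen items, and the
# speech scores alone slightly favour the wrong one...
hypotheses = [("window_a", 0.40), ("window_b", 0.45)]
# ...but the gaze tracker reports the user is looking at window_a,
# so fusion flips the decision (0.40 + 0.50 = 0.90 > 0.45).
print(resolve_referent(hypotheses, "window_a"))  # window_a
```

The point of the sketch is only the structure of the argument: neither modality alone is decisive, but a simple combination of evidence from both yields an unambiguous interpretation.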