We describe a real-time computer vision and machine learning system for
modeling and recognizing human behaviors in a visual surveillance task [1].
The system is particularly concerned with detecting when interactions
between people occur and classifying the type of interaction. Examples of
interesting interaction behaviors include following another person and
altering one's path to meet another. Our system combines top-down with
bottom-up information in a closed feedback loop, with both components
employing a statistical Bayesian approach [2]. We propose and compare two
state-based learning architectures for modeling behaviors and interactions:
hidden Markov models (HMMs) and coupled hidden Markov models (CHMMs). The
CHMM is shown to work much more efficiently and accurately than the HMM.
Finally, to deal with the problem of limited training data, a synthetic
"Alife-style" training system is used to develop flexible prior models for
recognizing human interactions. We demonstrate that these a priori models
can accurately classify real human behaviors and interactions with no
additional tuning or training.