This paper addresses the problem of spatio-temporal segmentation of video
sequences. An initial intensity segmentation method (watershed segmentation)
provides a number of initial segments which are subsequently labeled, with
a known number of labels, according to motion information. The label field
is modeled as a Markov Random Field where the statistical spatial and
temporal interactions are expressed on the basis of the initial watershed
segments. The labeling criterion is the maximization of the conditional a
posteriori probability of the label field given the motion hypotheses, the
estimate of the label field of the previous frame, and the image
intensities. For the optimization, an iterative motion estimation-labeling
algorithm is proposed and experimental results are presented.
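
As a compact restatement of the labeling criterion, the following sketch
uses notation that is assumed here for illustration and does not come from
the text: x_t denotes the label field over the watershed segments at frame
t, \hat{x}_{t-1} the estimate of the label field of the previous frame,
\Theta the set of motion hypotheses, and I the image intensities.

\[
  \hat{x}_t \;=\; \arg\max_{x_t} \; P\bigl(x_t \mid \Theta,\ \hat{x}_{t-1},\ I\bigr)
\]

Under these assumptions, the iterative scheme would presumably alternate
between re-estimating \Theta with the labels held fixed and updating x_t
with \Theta held fixed.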