To provide multimedia applications with new functionalities, such as
content-based interactivity and scalability, the new video coding standard
MPEG-4 relies on a content-based representation. This requires a prior
decomposition of sequences into semantically meaningful, physical objects.
We formulate this problem as one of separating foreground objects from the
background based on motion information.
For the object of interest, a two-dimensional binary model is derived and
tracked throughout the sequence. The model points consist of edge pixels
detected by the Canny operator. To accommodate rotation and changes in
shape of the tracked object, the model is updated every frame. These binary
models then guide the actual video object plane (VOP) extraction. Thanks to
our new boundary postprocessor and the excellent edge localization
properties of the Canny operator, the resulting VOP contours are very
accurate. Both the model initialization and update stages exploit motion
information. The main assumption underlying our approach is the existence
of a dominant global motion that can be assigned to the background. Areas
that do not follow this background motion indicate the presence of
independently moving physical objects. Two alternative methods to identify
such objects are presented. The first one employs a morphological motion
filter with a new filter criterion, which measures the deviation of the
locally estimated optical flow from the corresponding global motion. The
second method computes a change detection mask by taking the difference
between consecutive frames. The first method is more suitable for sequences
with little motion, whereas the second method is better at dealing with
faster moving or changing objects. Experimental results demonstrate the
performance of our algorithm.
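
The two motion cues described above can be illustrated with a minimal sketch. It is not the algorithm of this work: the global motion is assumed here to be a single translation vector, the morphological motion filter is replaced by a plain per-pixel threshold, and the frames are assumed to be already compensated for background motion. The function names and threshold values are hypothetical and chosen for illustration only.

```python
import math


def flow_deviation_mask(flow, global_motion, threshold=1.0):
    """Mark pixels whose local flow deviates from the global motion.

    flow          -- 2-D grid (list of rows) of (u, v) flow vectors
    global_motion -- one (u, v) vector; assumes purely translational
                     background motion (a simplification)
    threshold     -- hypothetical deviation threshold in pixels
    """
    gu, gv = global_motion
    return [
        [1 if math.hypot(u - gu, v - gv) > threshold else 0 for (u, v) in row]
        for row in flow
    ]


def change_detection_mask(prev_frame, curr_frame, threshold=15):
    """Mark pixels whose grey level changed between consecutive frames.

    Frames are 2-D grids of grey levels of equal size; the frames are
    assumed to be aligned to the background beforehand.
    threshold is a hypothetical grey-level difference threshold.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# Tiny illustrative inputs (made up for this sketch).
flow = [[(1.0, 0.0), (1.0, 0.0)],
        [(1.0, 0.0), (4.0, 3.0)]]
deviation = flow_deviation_mask(flow, global_motion=(1.0, 0.0))

prev_frame = [[10, 10, 10],
              [10, 10, 10]]
curr_frame = [[10, 80, 10],
              [10, 12, 10]]
changed = change_detection_mask(prev_frame, curr_frame)
```

In both cases a value of 1 marks a pixel that does not follow the background. In this toy input only the flow vector (4.0, 3.0) and the grey-level jump from 10 to 80 exceed their thresholds.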