Video segmentation for content-based coding

Authors
Meier, T.; Ngan, K.N.
Citation
T. Meier and K.N. Ngan, Video segmentation for content-based coding, IEEE Trans. Circuits Syst. Video Technol., 9(8), 1999, pp. 1190-1203
Citations number
26
Subject Categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
ISSN journal
1051-8215
Volume
9
Issue
8
Year of publication
1999
Pages
1190-1203
Database
ISI
SICI code
1051-8215(199912)9:8<1190:VSFCC>2.0.ZU;2-F
Abstract
To provide multimedia applications with new functionalities, such as content-based interactivity and scalability, the new video coding standard MPEG-4 relies on a content-based representation. This requires a prior decomposition of sequences into semantically meaningful, physical objects. We formulate this problem as one of separating foreground objects from the background based on motion information. For the object of interest, a two-dimensional binary model is derived and tracked throughout the sequence. The model points consist of edge pixels detected by the Canny operator. To accommodate rotation and changes in shape of the tracked object, the model is updated every frame. These binary models then guide the actual video object plane (VOP) extraction. Thanks to our new boundary postprocessor and the excellent edge localization properties of the Canny operator, the resulting VOP contours are very accurate. Both the model initialization and update stages exploit motion information. The main assumption underlying our approach is the existence of a dominant global motion that can be assigned to the background. Areas that do not follow this background motion indicate the presence of independently moving physical objects. Two alternative methods to identify such objects are presented. The first one employs a morphological motion filter with a new filter criterion, which measures the deviation of the locally estimated optical flow from the corresponding global motion. The second method computes a change detection mask by taking the difference between consecutive frames. The first version is more suitable for sequences with little motion, whereas the second version is better at dealing with faster moving or changing objects. Experimental results demonstrate the performance of our algorithm.
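
The change-detection variant described in the abstract lends itself to a brief illustration. The sketch below, written against OpenCV, is an assumption-laden approximation rather than the authors' implementation: the function names, thresholds, and morphological clean-up are hypothetical, and the paper's boundary postprocessor, model tracking, and global motion estimation are omitted.

```python
# Minimal sketch of the abstract's change-detection route, NOT the
# authors' method. Thresholds and the opening step are assumptions.
import cv2


def change_detection_mask(prev_gray, curr_gray, thresh=25):
    """Binary mask of pixels that changed between consecutive frames."""
    diff = cv2.absdiff(curr_gray, prev_gray)
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    # Morphological opening suppresses isolated noise responses.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)


def binary_edge_model(curr_gray, mask, canny_lo=50, canny_hi=150):
    """Canny edge pixels restricted to the changed region, standing in
    for the 2-D binary model the paper tracks and updates per frame."""
    edges = cv2.Canny(curr_gray, canny_lo, canny_hi)
    return cv2.bitwise_and(edges, mask)


# Example usage on two consecutive grayscale frames:
# prev = cv2.imread("frame_0.png", cv2.IMREAD_GRAYSCALE)
# curr = cv2.imread("frame_1.png", cv2.IMREAD_GRAYSCALE)
# model = binary_edge_model(curr, change_detection_mask(prev, curr))
```

Per the abstract, this frame-differencing route is better suited to faster moving or changing objects, while the morphological motion filter on locally estimated optical flow is preferred for sequences with little motion.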