An automatic object-oriented video segmentation and representation algorithm is proposed, where the local variance contrast and the frame difference contrast are jointly exploited for meaningful moving object extraction because these two visual features can indicate the spatial homogeneity of the gray levels and the temporal coherence of the motion fields efficiently.
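The abstract does not give explicit formulas for the two features; the sketch below assumes the spatial feature is the gray-level variance over a small window around each pixel and the temporal feature is the absolute difference between consecutive frames (the window size, padding mode, and function names are illustrative assumptions, not the paper's exact definitions):

```python
import numpy as np

def local_variance_contrast(frame, win=3):
    """Per-pixel gray-level variance over a (win x win) window.

    Low values indicate spatially homogeneous regions; the window size and
    edge padding are assumptions made for this sketch.
    """
    f = frame.astype(np.float64)
    pad = win // 2
    padded = np.pad(f, pad, mode="edge")
    # Stack all shifted copies of the frame that cover the window.
    shifts = [padded[dy:dy + f.shape[0], dx:dx + f.shape[1]]
              for dy in range(win) for dx in range(win)]
    return np.stack(shifts, axis=0).var(axis=0)

def frame_difference_contrast(frame_t, frame_prev):
    """Absolute gray-level difference between consecutive frames, used as a
    simple indicator of temporal (motion) activity."""
    return np.abs(frame_t.astype(np.float64) - frame_prev.astype(np.float64))
```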
The 2-D entropic thresholding technique and the watershed transformation method are further developed to determine the global feature thresholds adaptively according to the variation of the video components.
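As a rough illustration of the 2-D entropic thresholding step (the watershed refinement is omitted), the sketch below picks a joint threshold over the two feature maps by maximizing the combined entropy of the low/low and high/high quadrants of their 2-D histogram; the bin count and this particular entropy criterion are assumptions, not necessarily the paper's exact formulation:

```python
import numpy as np

def entropic_threshold_2d(feat_a, feat_b, bins=64):
    """Choose a joint threshold (ta, tb) over two feature maps by maximizing
    the sum of the entropies of the 'background' (low/low) and 'object'
    (high/high) quadrants of their 2-D histogram."""
    hist, a_edges, b_edges = np.histogram2d(feat_a.ravel(), feat_b.ravel(), bins=bins)
    p = hist / hist.sum()

    def quadrant_entropy(block):
        mass = block.sum()
        if mass <= 0:
            return -np.inf              # an empty quadrant cannot define a split
        q = block[block > 0] / mass     # renormalize probabilities within the quadrant
        return -np.sum(q * np.log(q))

    best, best_idx = -np.inf, (1, 1)
    for s in range(1, bins - 1):
        for t in range(1, bins - 1):
            h = quadrant_entropy(p[:s, :t]) + quadrant_entropy(p[s:, t:])
            if h > best:
                best, best_idx = h, (s, t)
    # Map the best bin indices back to values in the feature domain.
    return a_edges[best_idx[0]], b_edges[best_idx[1]]
```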
The obtained video components are first represented coarsely by a group of 4x4 blocks, and then the meaningful moving objects are generated by an iterative region-merging procedure according to the spatiotemporal similarity measure.
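The following sketch illustrates the coarse 4x4-block representation and a simple iterative merging loop; the Euclidean distance between block-mean feature vectors and the stopping threshold stand in for the paper's spatiotemporal similarity measure and are assumptions of this sketch:

```python
import numpy as np

def block_features(var_map, diff_map, block=4):
    """Summarize each 4x4 block by its mean variance contrast and mean
    frame-difference contrast, giving an (H/4, W/4, 2) feature grid."""
    h = (var_map.shape[0] // block) * block
    w = (var_map.shape[1] // block) * block
    v = var_map[:h, :w].reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    d = diff_map[:h, :w].reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    return np.stack([v, d], axis=-1)

def merge_regions(features, sim_thresh=10.0):
    """Iteratively merge the pair of 4-adjacent regions whose mean feature
    vectors are closest, until the closest pair exceeds `sim_thresh`."""
    H, W, _ = features.shape
    labels = np.arange(H * W).reshape(H, W)        # one region per block initially
    while True:
        means = {r: features[labels == r].mean(axis=0) for r in np.unique(labels)}
        # Collect pairs of distinct regions that touch along a block edge.
        pairs = set(zip(labels[:, :-1].ravel(), labels[:, 1:].ravel()))
        pairs |= set(zip(labels[:-1, :].ravel(), labels[1:, :].ravel()))
        pairs = {(a, b) for a, b in pairs if a != b}
        if not pairs:
            break
        a, b = min(pairs, key=lambda p: np.linalg.norm(means[p[0]] - means[p[1]]))
        if np.linalg.norm(means[a] - means[b]) > sim_thresh:
            break                                  # no sufficiently similar neighbors left
        labels[labels == b] = a                    # merge the most similar pair
    return labels
```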
A temporal tracking procedure is further proposed to obtain more semantic moving objects and to establish the correspondence of the moving objects among frames.
Therefore, the proposed automatic moving object extraction algorithm can detect the appearance of new objects as well as the disappearance of existing objects efficiently because the correspondence of the video objects among frames is also established.
Moreover, an object-oriented video representation and indexing approach is suggested, where both the operation of the camera (i.e., change of the viewpoint) and the birth or death of the individual objects are exploited to detect the breakpoints of the video data and to select the key frames adaptively.
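A minimal sketch of the adaptive breakpoint and key-frame logic follows; the per-frame fields `camera_change` (a scalar summarizing the estimated change of the camera operation, i.e., the viewpoint) and `object_ids` (the set of tracked object ids present in the frame) are assumed to come from earlier stages and are not defined in this form by the paper:

```python
def select_key_frames(frames_info, camera_change_thresh=0.3):
    """Declare a breakpoint, and keep the frame as a key frame, whenever the
    camera operation changes substantially or an object appears/disappears."""
    key_frames = [0]                               # always keep the first frame
    last_ids = frames_info[0]['object_ids']
    for k, info in enumerate(frames_info[1:], start=1):
        camera_break = info['camera_change'] > camera_change_thresh
        object_break = info['object_ids'] != last_ids   # birth or death of an object
        if camera_break or object_break:
            key_frames.append(k)
            last_ids = info['object_ids']
    return key_frames
```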
© 2000 Society of Photo-Optical Instrumentation Engineers. [S0091-3286(00)00102-1].