AUTOMATIC VIDEO DATA STRUCTURING THROUGH SHOT PARTITIONING AND KEY-FRAME COMPUTING

Citation
W. Xiong et al., AUTOMATIC VIDEO DATA STRUCTURING THROUGH SHOT PARTITIONING AND KEY-FRAME COMPUTING, Machine Vision and Applications, 10(2), 1997, pp. 51-65
Number of citations
23
Subject Categories
Control Theory & Cybernetics; Computer Sciences, Special Topics; Computer Sciences; Engineering, Electrical & Electronic; Computer Science Cybernetics
Journal ISSN
0932-8092
Volume
10
Issue
2
Year of publication
1997
Pages
51 - 65
Database
ISI
SICI code
0932-8092(1997)10:2<51:AVDSTS>2.0.ZU;2-8
Abstract
In video processing, a common first step is to segment the videos into physical units, generally called shots. A shot is a video segment that consists of one continuous action. In general, these physical units need to be clustered to form more semantically significant units, such as scenes, sequences, programs, etc. This is the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. The shots or scenes are usually described by one or several representative frames, called key-frames. Viewed from a higher level, key frames of some shots might be redundant in terms of semantics. In this paper, we propose automatic solutions to the problems of: (i) video partitioning, (ii) key frame computing, (iii) key frame pruning. For the first problem, an algorithm called "net comparison" is devised. It is accurate and fast because it uses both statistical and spatial information in an image and does not have to process the entire image. For the last two problems, we develop an original image similarity criterion, which considers both spatial layout and detail content in an image. For this purpose, coefficients of wavelet decomposition are used to derive parameter vectors accounting for the above two aspects. The parameters exhibit (quasi-) invariant properties, thus making the algorithm robust for many types of object/camera motions and scaling variances. The novel "seek and spread" strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key-frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.
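
The abstract does not spell out the net comparison algorithm, so the following is only a rough sketch of the general idea it mentions: detecting shot boundaries from a statistic (block mean intensity) taken at a sparse set of spatial locations rather than over the whole image. It is not the authors' method. The function names, the grid sampling scheme, and the thresholds `block_thresh` and `ratio_thresh` are all assumptions made for illustration.

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (hypothetical representation).
Frame = Sequence[Sequence[int]]


def block_means(frame: Frame, block: int, step: int) -> List[float]:
    """Mean intensity of square blocks sampled on a sparse grid.

    Only blocks whose top-left corner falls on every `step`-th row and
    column are read, so with step > block most pixels are never touched.
    """
    height, width = len(frame), len(frame[0])
    means = []
    for top in range(0, height - block + 1, step):
        for left in range(0, width - block + 1, step):
            total = sum(
                frame[r][c]
                for r in range(top, top + block)
                for c in range(left, left + block)
            )
            means.append(total / (block * block))
    return means


def is_shot_boundary(prev: Frame, curr: Frame, block: int = 2, step: int = 4,
                     block_thresh: float = 20.0,
                     ratio_thresh: float = 0.5) -> bool:
    """Flag a boundary when most sampled blocks change noticeably.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    a = block_means(prev, block, step)
    b = block_means(curr, block, step)
    if not a:  # frame smaller than one block: nothing to compare
        return False
    changed = sum(1 for x, y in zip(a, b) if abs(x - y) > block_thresh)
    return changed / len(a) > ratio_thresh


def find_cuts(frames: Sequence[Frame], **kwargs) -> List[int]:
    """Indices i where a boundary is detected between frames i-1 and i."""
    return [
        i for i in range(1, len(frames))
        if is_shot_boundary(frames[i - 1], frames[i], **kwargs)
    ]
```

Requiring a majority of sampled blocks to change, rather than a single global difference, is one simple way to combine a statistical measure with spatial information: a small moving object alters only a few blocks, while an abrupt cut alters most of them.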