USE OF CAPTIONS AND OTHER COLLATERAL TEXT IN UNDERSTANDING PHOTOGRAPHS

Authors
Srihari, R. K.
Citation
R. K. Srihari, USE OF CAPTIONS AND OTHER COLLATERAL TEXT IN UNDERSTANDING PHOTOGRAPHS, Artificial Intelligence Review, 8(5-6), 1995, pp. 409-430
Number of citations
22
Subject Categories
Computer Sciences, Special Topics; Computer Science, Artificial Intelligence
Journal ISSN
0269-2821
Volume
8
Issue
5-6
Year of publication
1995
Pages
409 - 430
Database
ISI
SICI code
0269-2821(1995)8:5-6<409:UOCAOC>2.0.ZU;2-Q
Abstract
This research explores the interaction of textual and photographic information in image understanding. Specifically, it presents a computational model whereby textual captions are used as collateral information in the interpretation of the corresponding photographs. The final understanding of the picture and caption reflects a consolidation of the information obtained from each of the two sources and can thus be used in intelligent information retrieval tasks. The problem of building a general-purpose computer vision system without a priori knowledge is very difficult at best. The concept of using collateral information in scene understanding has been explored in systems that use general scene context in the task of object identification. The work described here extends this notion by incorporating picture-specific information. A multi-stage system, PICTION, which uses captions to identify humans in an accompanying photograph, is described. This provides a computationally less expensive alternative to traditional methods of face recognition. A key component of the system is the utilisation of spatial and characteristic constraints (derived from the caption) in labeling face candidates (generated by a face locator).
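
The constraint-based labeling step mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the system's actual algorithm, which the abstract does not specify. It assumes that the caption has already been parsed into names, a "left of" spatial constraint, and a gender characteristic, and that a face locator has already returned candidate positions. All names here (FaceCandidate, LeftOf, label_faces) are hypothetical, and the exhaustive search is chosen only for clarity.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FaceCandidate:
    """Hypothetical output of a face locator: box centre plus an optional trait."""
    x: float                      # horizontal centre in pixels
    y: float                      # vertical centre in pixels
    gender: Optional[str] = None  # assumed characteristic estimate, None if unknown


@dataclass(frozen=True)
class LeftOf:
    """Hypothetical spatial constraint from the caption: `a` appears left of `b`."""
    a: str
    b: str


def label_faces(
    names: List[str],
    genders: Dict[str, str],
    spatial: List[LeftOf],
    candidates: List[FaceCandidate],
) -> List[Dict[str, int]]:
    """Return every name -> candidate-index assignment that violates no constraint.

    Brute-force search over injective assignments; extra candidates
    (for example false positives from the locator) may stay unlabeled.
    """
    solutions = []
    for perm in permutations(range(len(candidates)), len(names)):
        assignment = dict(zip(names, perm))
        # Spatial constraints: compare horizontal positions of the assigned faces.
        if not all(
            candidates[assignment[c.a]].x < candidates[assignment[c.b]].x
            for c in spatial
        ):
            continue
        # Characteristic constraints: reject only on a definite mismatch.
        mismatch = any(
            genders.get(name) is not None
            and candidates[idx].gender is not None
            and genders[name] != candidates[idx].gender
            for name, idx in assignment.items()
        )
        if not mismatch:
            solutions.append(assignment)
    return solutions


# Illustrative caption: "Mary Smith, left, with John Doe."
candidates = [
    FaceCandidate(x=300, y=120, gender="male"),
    FaceCandidate(x=100, y=110, gender="female"),
    FaceCandidate(x=500, y=130, gender="female"),  # spurious extra candidate
]
result = label_faces(
    names=["Mary Smith", "John Doe"],
    genders={"Mary Smith": "female", "John Doe": "male"},
    spatial=[LeftOf("Mary Smith", "John Doe")],
    candidates=candidates,
)
print(result)
```

In this illustrative case the spatial constraint alone leaves three consistent assignments, and the gender constraint narrows them to one, which is the sense in which caption-derived constraints can stand in for a costlier face-recognition step.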