This research explores the interaction of textual and photographic information in image understanding. Specifically, it presents a computational model whereby textual captions are used as collateral information in the interpretation of the corresponding photographs. The final understanding of the picture and caption reflects a consolidation of the information obtained from each of the two sources and can thus be used in intelligent information-retrieval tasks. Building a general-purpose computer vision system without a priori knowledge remains extremely difficult. The concept of using collateral information in scene understanding has been explored in systems that use general scene context in the task of object identification. The work described here extends this notion by incorporating picture-specific information. A multi-stage system, FICTION, which uses captions to identify humans in an accompanying photograph, is described. This provides a computationally less expensive alternative to traditional methods of face recognition. A key component of the system is the utilization of spatial and characteristic constraints (derived from the caption) in labeling face candidates (generated by a face locator).
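The constraint-based labeling step can be illustrated with a minimal sketch. This is not the FICTION implementation; the function names, the candidate representation (bounding-box centers), and the single `left_of` predicate are all illustrative assumptions. The sketch enumerates assignments of caption names to face candidates and keeps only those consistent with the caption-derived spatial constraints.

```python
from itertools import permutations

def left_of(a, b):
    """Hypothetical spatial predicate: candidate a lies to the left of b."""
    return a["x"] < b["x"]

def label_faces(names, candidates, constraints):
    """Return every assignment of names to face candidates that satisfies
    all caption-derived constraints.

    Each constraint is a (predicate, name_a, name_b) triple; candidates
    are dicts holding an id and a bounding-box center (illustrative only).
    """
    labelings = []
    for perm in permutations(candidates, len(names)):
        assignment = dict(zip(names, perm))
        if all(pred(assignment[a], assignment[b])
               for pred, a, b in constraints):
            labelings.append({n: c["id"] for n, c in assignment.items()})
    return labelings

# Hypothetical face-locator output: three candidate boxes, left to right.
candidates = [
    {"id": 0, "x": 40,  "y": 100},
    {"id": 1, "x": 200, "y": 95},
    {"id": 2, "x": 360, "y": 110},
]

# Caption "John (left) and Mary" yields one spatial constraint.
constraints = [(left_of, "John", "Mary")]
print(label_faces(["John", "Mary"], candidates, constraints))
```

In this toy case three of the six possible name-to-candidate assignments survive the single constraint; in practice additional characteristic constraints (e.g., gender or age cues from the caption) would prune the set further, which is why constraint satisfaction over a handful of candidates is far cheaper than full face recognition.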