The average person with a computer will soon have access to the
world's collections of digital video and images. However, unlike text
that can be alphabetized or numbers that can be ordered, images and
video have no general language to aid in their organization. Tools
that can "see" and "understand" the content of imagery are still in
their infancy, but they are now at the point where they can provide
substantial assistance to users in navigating through visual media.
This paper describes new tools based on "vision texture" for modeling
images and video. The focus of this research is the use of a society
of low-level models for performing relatively high-level tasks, such
as retrieval and annotation of image and video libraries. This paper
surveys recent and present research in this fast-growing area.