Content-based image retrieval is based on the idea of extracting visual
features from images and using them to index the images in a database. The
comparisons that determine similarity between images depend on the representations
of the features and the definition of an appropriate distance function. Most
of the research literature uses vectors as the predominant representation,
given the rich theory of vector spaces. While vectors are an extremely useful
representation, their use in large databases may be prohibitive given their
typically high dimensionality and the cost of evaluating similarity functions
over them. In this paper, we propose similarity measures and an indexing
algorithm, based on information theory, that permit an image to be
represented as a single number. When used in
conjunction with vectors, our method displays improved efficiency when
querying large databases.
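
As a rough illustration of how a single-number representation might work alongside vectors, the sketch below assumes the scalar is the Shannon entropy of an image's gray-level histogram. The abstract does not name the information-theoretic quantity, so this choice is an assumption, and the names ScalarIndex, query, and tolerance are hypothetical. The scalar key narrows the search to a small candidate set, and only those candidates are compared as full vectors.

```python
import math
from bisect import bisect_left, bisect_right


def entropy(histogram):
    """Shannon entropy, in bits, of a histogram of non-negative counts."""
    total = sum(histogram)
    if total == 0:
        return 0.0
    h = 0.0
    for count in histogram:
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ScalarIndex:
    """Hypothetical index: images sorted by one scalar key (assumed: entropy)."""

    def __init__(self, histograms):
        # Each entry is (key, image_id, histogram), kept sorted by key.
        self.entries = sorted(
            (entropy(h), i, h) for i, h in enumerate(histograms)
        )
        self.keys = [entry[0] for entry in self.entries]

    def query(self, histogram, tolerance=0.1):
        """Return (distance, image_id) pairs for images with a nearby key."""
        key = entropy(histogram)
        # Binary search on the scalar key selects the candidate range.
        lo = bisect_left(self.keys, key - tolerance)
        hi = bisect_right(self.keys, key + tolerance)
        # Full vector comparison runs only on the surviving candidates.
        return sorted(
            (euclidean(histogram, h), i) for _, i, h in self.entries[lo:hi]
        )


# Toy 4-bin histograms standing in for real image features.
database = [[1, 1, 1, 1], [4, 0, 0, 0], [2, 2, 0, 0], [1, 1, 1, 2]]
index = ScalarIndex(database)
matches = index.query([1, 1, 1, 1], tolerance=0.1)
```

In this sketch the range lookup on the sorted keys costs O(log n), so the more expensive vector distance is evaluated only for images whose scalar key falls within the tolerance of the query's key. Two visually different images can share the same entropy, which is why the scalar serves here as a filter in front of the vector comparison rather than as a replacement for it.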