This paper reviews the role of information theory in characterizing the
fundamental limits of watermarking systems and in guiding the development
of optimal watermark embedding algorithms and optimal attacks. Watermarking
can be viewed as a communication problem with side information (in the form
of the host signal and/or a cryptographic key) available at the encoder and
the decoder. The problem is mathematically defined by distortion
constraints, by statistical models for the host signal, and by the
information available in the game between the information hider, the
attacker, and the decoder. In particular, information theory explains why
the performance of watermark decoders that do not have access to the host
signal may surprisingly be as good as the performance of decoders that know
the host signal. The theory is illustrated with several examples, including
an application to image watermarking. Capacity expressions are derived
under a parallel-Gaussian model for the host-image source. Sparsity is the
single most important property of the source that determines capacity.
(C) 2001 Elsevier Science B.V. All rights reserved.