The computation of channel capacity with side information at the
transmitter (but not at the receiver) requires, in general, extension of
the input alphabet to a space of "strategies," and is often hard. We
consider the special case of a discrete memoryless modulo-additive noise
channel $Y = X + Z(S)$, where the encoder causally observes the random
state $S \in \mathcal{S}$ that governs the distribution of the noise
$Z(S)$. We show that the capacity of this channel is given by
$$C = \log|\mathcal{X}| - \min_{t:\,\mathcal{S}\to\mathcal{X}} H\bigl(Z(S) - t(S)\bigr).$$
This capacity is realized by a state-independent code, followed by a
shift by the "noise prediction" $t_{\min}(S)$ that minimizes the entropy
of $Z(S) - t(S)$. If the set of conditional noise distributions
$\{p(z \mid s),\ s \in \mathcal{S}\}$ is such that the optimum predictor
$t_{\min}(\cdot)$ is independent of the state weights, then $C$ is also
the capacity for a noncausal encoder, which observes the entire state
sequence in advance. Furthermore, for this case we also derive a simple
formula for the capacity when the state process has memory.
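
For small alphabets, the capacity expression for the causal encoder can be evaluated numerically by exhaustive search over all predictors $t$. The following is a minimal sketch under stated assumptions, not the method used in the paper: logarithms are taken base 2, the search is brute force (its cost grows as $|\mathcal{X}|^{|\mathcal{S}|}$), and the function names and the example distributions are hypothetical, chosen only for illustration.

```python
import itertools
import math


def entropy_bits(pmf):
    """Shannon entropy, in bits, of a probability vector."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def residual_noise_pmf(state_probs, noise_given_state, t, q):
    """PMF of (Z(S) - t(S)) mod q, averaged over the random state S."""
    pmf = [0.0] * q
    for s, p_s in enumerate(state_probs):
        for z, p_z in enumerate(noise_given_state[s]):
            pmf[(z - t[s]) % q] += p_s * p_z
    return pmf


def capacity_bits(state_probs, noise_given_state):
    """Evaluate log2(q) - min_t H(Z(S) - t(S)) by trying every map t: S -> X."""
    q = len(noise_given_state[0])
    best_t, best_h = None, float("inf")
    for t in itertools.product(range(q), repeat=len(state_probs)):
        h = entropy_bits(residual_noise_pmf(state_probs, noise_given_state, t, q))
        if h < best_h:
            best_t, best_h = t, h
    return math.log2(q) - best_h, best_t


# Hypothetical example: ternary alphabet, two equiprobable states.
state_probs = [0.5, 0.5]
noise_given_state = [
    [0.8, 0.1, 0.1],  # state 0: noise concentrated at 0
    [0.1, 0.8, 0.1],  # state 1: noise concentrated at 1
]
capacity, predictor = capacity_bits(state_probs, noise_given_state)
print(f"capacity = {capacity:.4f} bits, predictor = {predictor}")
```

In this example the two conditional noise distributions are cyclic shifts of one another, so a predictor with $t(1) - t(0) = 1 \pmod 3$ aligns them and the residual noise has the entropy of a single conditional distribution, giving a capacity of about 0.663 bits. The minimizer is only determined up to a common shift, since adding a constant to $t$ permutes the residual distribution cyclically without changing its entropy.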