Given a finite collection of strings of letters from a fixed alphabet,
it is of interest, in the contexts of data compression and DNA
sequencing, to find the length of the shortest string which contains each of
the given strings as a consecutive substring. In order to analyze the
average behavior of the optimal superstring length, substrings of
specified lengths are considered with the letters selected independently
at random. An asymptotic expression is obtained for the savings from
compression, i.e., the difference between the uncompressed (concatenated)
length and the optimal superstring length.
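
To make the quantity concrete, the following is a minimal sketch of the savings computed by brute force on a tiny input. It is an illustration only, not the method of analysis used here: the function names (overlap, shortest_superstring_length, savings) are hypothetical, and the exhaustive search over orderings is practical only for a handful of strings.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring_length(strings):
    """Exact optimal superstring length by exhaustive search (illustrative only)."""
    # Remove duplicates, then strings already contained in another string;
    # neither can lengthen an optimal superstring.
    unique = list(dict.fromkeys(strings))
    reduced = [s for s in unique if not any(s != t and s in t for t in unique)]
    if not reduced:
        return 0
    best = None
    # For a substring-free set, some ordering merged with maximal
    # consecutive overlaps yields a shortest superstring.
    for order in permutations(reduced):
        length = len(order[0])
        for prev, cur in zip(order, order[1:]):
            length += len(cur) - overlap(prev, cur)
        if best is None or length < best:
            best = length
    return best


def savings(strings):
    """Concatenated length minus optimal superstring length."""
    return sum(len(s) for s in strings) - shortest_superstring_length(strings)


print(savings(["abc", "bcd", "cde"]))  # superstring "abcde": 9 - 5 = 4
```

For the three strings above, the concatenated length is 9 and the shortest superstring "abcde" has length 5, so the savings is 4.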