Superstrings have many applications in data compression and genetics.
However, the decision version of the shortest superstring problem is N
P-complete. In this paper we examine the complexity of approximating s
hortest superstrings. There are two basic measures of the approximatio
ns: the length factor and the compression factor. The well known and p
ractical approximation algorithm is the sequential algorithm GREEDY. I
t approximates the shortest superstring with the compression factor of
1/2 and with the length factor of 4. Our main results are: (1) A sequ
ential length approximation algorithm which achieves a length factor o
f 2.83. This result improves the best previously known bound of 2.89 d
ue to Teng and Yao. Very recently, this bound was improved by Kosaraju
, Park, and Stein to 2.79, and by Armen and Stein to 2.75. (2) A proof
that the algorithm GREEDY is not parallelizable, the computation of i
ts output is P-complete. (3) An NC algorithm which achieves the compre
ssion factor of 1/(4 + epsilon). (4) The design of an RNC algorithm wi
th constant length factor and an NC algorithm with logarithmic length
factor. (C) 1997 Academic Press.