Motivation Dynamic programming is the core algorithm of sequence comparison
, alignment and linear hidden Markov model (HMM) training. For a pair of se
quence lengths m and n, the problem cart be solved readily in O (mn) time a
nd O (mn) space. The checkpoint algorithm introduced by Grice et al. (CABIO
S, 13, 45-53, 1997) runs in O(Lmn) time and O (Lm L rootn) space, where L i
s a positive integer determined by m, n, and the amount of available worksp
ace. The algorithm is appropriate for many string comparison problems, incl
uding all-paths and single-best-path hidden Markov model training, and is r
eadily parallelizable, The checkpoint algorithm has a diagonal version that
can solve the single-best-path alignment problem in O (mn) time and O (m n) space.
Results: In this work, Mle improve performance by analyzlizg optimal checkp
oint placement The improved row checkpoint algorithm performs up to one hal
f the computation of the original algorithm. The improved diagonal checkpoi
nt algorithm performs up to 35% fever computational steps than the original
.
We modified the SAM hidden Markov modeling package to use the improved row
checkpoint algorithm. For a fixed sequence length, the new version is up to
33% faster for all-paths and 56% faster for single-best-path HMM training,
depending on sequence length and allocated memory. Over a typical set of p
rotein sequence lengths, the improvement is similar to 10%.