M. Waterman, ESTIMATING STATISTICAL SIGNIFICANCE OF SEQUENCE ALIGNMENTS, Philosophical transactions-Royal Society of London. Biological sciences, 344(1310), 1994, pp. 383-390
Algorithms that compare two proteins or DNA sequences and produce an a
lignment of the best matching segments are widely used in molecular bi
ology. These algorithms produce scores that when comparing random sequ
ences of length n grow proportional to n or to log(n) depending on the
algorithm parameters. The Azuma-Hoeffding inequality gives an upper b
ound on the probability of large deviations of the score from its mean
in the linear case. Poisson approximation can be applied in the logar
ithmic case.