A heuristic approximation to the score distribution of gapped alignments in
the logarithmic domain is presented. The method applies to comparisons
between random, unrelated protein sequences, using standard score matrices
and arbitrary gap penalties. It is shown that gapped alignment behavior is
essentially governed by a single parameter, alpha, depending on the penalty
scheme and sequence composition. This treatment also predicts the position
of the transition point between logarithmic and linear behavior. The
approximation is tested by simulation and shown to be accurate over a range
of commonly used substitution matrices and gap penalties.
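
The kind of simulation referred to above can be sketched in a few lines of
Python. This is only an illustrative sketch, not the procedure used in this
work: the function names (gapped_local_score, mean_best_score), the uniform
amino-acid composition, the simple match/mismatch scores (used here in place
of a standard substitution matrix) and the affine gap cost of
gap_open + k * gap_extend for a gap of length k are all assumptions made for
the example. It computes the best gapped local alignment score for pairs of
random sequences and averages it over several trials.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def gapped_local_score(a, b, match=5, mismatch=-4, gap_open=10, gap_extend=1):
    """Best local alignment score with affine gaps (Smith-Waterman/Gotoh style).

    Assumption for this sketch: a gap of length k costs gap_open + k * gap_extend,
    and residues are scored with a plain match/mismatch scheme.
    """
    n = len(b)
    neg = float("-inf")
    h_prev = [0] * (n + 1)    # best score of an alignment ending at row i-1, column j
    f_prev = [neg] * (n + 1)  # same, but ending in a gap that consumes residues of a
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (n + 1)
        f_cur = [neg] * (n + 1)
        e = neg               # best score ending in a gap that consumes residues of b
        for j in range(1, n + 1):
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend,
                           h_prev[j] - gap_open - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            if h_cur[j] > best:
                best = h_cur[j]
        h_prev, f_prev = h_cur, f_cur
    return best


def mean_best_score(length, trials, seed=0, **scoring):
    """Average best local score over pairs of random, unrelated sequences.

    Assumption: residues are drawn uniformly from the 20 amino acids.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        b = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        total += gapped_local_score(a, b, **scoring)
    return total / trials


if __name__ == "__main__":
    # Compare how the mean score grows with length under stiff and lenient gaps.
    for length in (50, 100, 200):
        stiff = mean_best_score(length, trials=20, gap_open=10, gap_extend=1)
        loose = mean_best_score(length, trials=20, gap_open=0, gap_extend=1)
        print(length, stiff, loose)
```

Running the script prints the mean score for a few sequence lengths under a
stiff and a lenient penalty setting. With stiff penalties the mean score is
expected to grow roughly logarithmically with length, and with sufficiently
lenient penalties roughly linearly; the specific penalty values above are
arbitrary and are not claimed to lie on either side of the transition point.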