Motivation: Noise in database searches resulting from random sequence
similarities increases as the databases expand rapidly. The noise problems
are not a technical shortcoming of the database search programs, but a
logical consequence of the idea of homology searches. The effect can be
observed in simulation experiments.
Results: We have investigated noise levels in pairwise-alignment-based
database searches. The noise levels of 38 releases of the SwissProt
database display perfect logarithmic growth with the total length of the
databases. Clustering of real biological sequences reduces noise levels,
but the effect is marginal.