Correlation functions in large sets of non-homologous protein sequences
are analysed. Finite size corrections are applied and fluctuations are
estimated. As symbol sequences have to be mapped to sequences of numbers
to calculate correlation functions, several property codes are tested as
such mappings. We found that hydrophobicity autocorrelation functions
oscillate strongly. Another strong signal is the monotonically decaying
alpha-helix propensity autocorrelation function. Furthermore, we detected
signals corresponding to an alternation of positively and negatively
charged residues at a distance of 3-4 amino acids. To look beyond the
property codes derived by the methods of physical chemistry, mappings
yielding a strong correlation signal are sought using a Monte Carlo
simulation. The mappings leading to strong signals are found to be
related to hydrophobicity or alpha-helix propensity. A cluster analysis
of the top scoring mappings leads to two novel property codes. These two
property codes are derived from sequence data alone and turn out to be
similar to known property codes for hydrophobicity or polarity.
© 1998 Academic Press Limited.
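
The mapping step described above (symbol sequence to number sequence, then autocorrelation) can be illustrated with a minimal Python sketch. It is not the estimator used in this work: it applies no finite size corrections and estimates no fluctuations. The Kyte-Doolittle hydropathy scale stands in as one example property code (whether it is among the codes tested here is an assumption), and the names `encode` and `autocorrelation` are hypothetical.

```python
import numpy as np

# Example property code: Kyte-Doolittle hydropathy values (an assumed choice).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(seq, code=KYTE_DOOLITTLE):
    """Map an amino acid string to a sequence of numbers via a property code."""
    return np.array([code[aa] for aa in seq], dtype=float)


def autocorrelation(seqs, max_lag, code=KYTE_DOOLITTLE):
    """Pooled, normalised autocorrelation C(k) for k = 0..max_lag.

    Mean and variance are taken over all residues of all sequences;
    residue pairs are only formed within a single sequence.
    """
    encoded = [encode(s, code) for s in seqs]
    pooled = np.concatenate(encoded)
    mean, var = pooled.mean(), pooled.var()
    if var == 0.0:
        raise ValueError("property values are constant; C(k) is undefined")

    sums = np.zeros(max_lag + 1)
    counts = np.zeros(max_lag + 1)
    for x in encoded:
        d = x - mean
        n = len(d)
        for k in range(min(max_lag, n - 1) + 1):
            # Products of deviations for all residue pairs at distance k.
            sums[k] += np.dot(d[: n - k], d[k:])
            counts[k] += n - k

    if np.any(counts == 0):
        raise ValueError("no sequence is longer than max_lag")
    return sums / (counts * var)


if __name__ == "__main__":
    # Toy input: strictly alternating hydrophobic / charged residues.
    print(autocorrelation(["IKIKIKIKIK"], max_lag=3))
```

In the toy input every residue at an odd distance has the opposite deviation from the mean, so the function alternates in sign. This is only a caricature of an oscillating autocorrelation, not a reproduction of the signals reported for real sequence sets.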