A statistical analysis was performed to determine to what extent an
amino acid determines the identity of its neighbors and to what extent
this is determined by the structural environment. Log-linear analysis
was used to discriminate chance occurrence from statistically
meaningful correlations. The classification of structures was
arbitrary, but was also tested for significance. A list of
statistically significant interaction types was selected and then
ranked according to apparent importance for applications such as
protein design. This showed that, in general, nonlocal, through-space
interactions were more important than those between residues near in
the protein sequence. The highest ranked nonlocal interactions involved
residues in beta-sheet structures. Of the local interactions, those
between residues i and i + 2 were the most important in both
alpha-helices and beta-strands. Some surprisingly strong correlations
were discovered within beta-sheets between residues and sites
sequentially near to their bridging partners. The results have a clear
bearing on protein engineering studies, but also have implications for
the construction of knowledge-based force fields.
Proteins 32:175-189, 1998. © 1998 Wiley-Liss, Inc.