We have analysed the complete sequence of the Escherichia coli K12 isolate
MG1655 genome for chromatin-associated protein binding sites, and compared
the locations of the predicted sites with experimental expression data
from 'DNA chip' experiments. Of the dozen proteins associated with chromatin
in E. coli, only three have been shown to have significant binding
preferences: integration host factor (IHF) has the strongest binding site
preference, FIS sites show a weak consensus, and there is no clear consensus
site for binding of the H-NS protein. Using hidden Markov models (HMMs), we
predict the location of 608 IHF sites, scattered throughout the genome. A
subset of the IHF sites associated with repeats tends to be clustered around
the origin of replication. We estimate there could be roughly 6000 FIS
sites in E. coli, and the sites tend to be localised in two regions flanking
the replication termini. We also show that the regions upstream of genes
regulated by H-NS are more curved and have a higher AT content than regions
upstream of other genes. In general, these regions also tend to be localised
near the replication terminus. © 2001 Société française de biochimie et
biologie moléculaire / Éditions scientifiques et médicales Elsevier SAS.
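
The AT-content comparison of upstream regions mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 200 bp window length, and the restriction to forward-strand genes are assumptions made for illustration only, and DNA curvature is not modelled here.

```python
from typing import Iterable


def at_content(seq: str) -> float:
    """Fraction of A/T among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    return at / total if total else 0.0


def upstream_region(genome: str, gene_start: int, length: int = 200) -> str:
    """Return up to `length` bases immediately 5' of a forward-strand gene.

    gene_start is a 0-based index into genome. The default length of 200
    is an arbitrary illustrative choice, not a value from the paper.
    """
    return genome[max(0, gene_start - length):gene_start]


def mean_at_content(genome: str, gene_starts: Iterable[int],
                    length: int = 200) -> float:
    """Mean AT content over the upstream regions of the given genes."""
    regions = [upstream_region(genome, s, length) for s in gene_starts]
    regions = [r for r in regions if r]  # skip genes with no upstream bases
    if not regions:
        return 0.0
    return sum(at_content(r) for r in regions) / len(regions)
```

Calling `mean_at_content` once with the start positions of H-NS-regulated genes and once with those of all other genes would give the two values to compare. Reverse-strand genes would need the downstream window taken and reverse-complemented, which this sketch omits.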