Several data libraries have been created to organize all the data
obtained worldwide about the Escherichia coli genome. Because the known
data now amount to more than 40% of the whole genome sequence, it has
become necessary to organize the data in such a way that appropriate
procedures can associate the knowledge produced by experiments on each
gene with, for example, its position on the chromosome and its relation
to other relevant genes. In addition, global properties of genes, which
are affected by the introduction of new entries, should be present as
appropriate
description fields. A data base, implemented on the Macintosh with the
data base management system 4th Dimension, is described. It is
constructed around a core constituted by known contigs of E. coli
sequences, and it links data collected in general libraries (unmodified)
to data associated with evolving knowledge (with modifiable fields).
Biologically
significant results obtained through the coupling of appropriate
procedures (learning or statistical data analysis) are presented. The
data base is available through a 4th Dimension runtime and through FTP
on the Internet. It has been regularly updated and will be
systematically linked to other E. coli data bases (M. Kroger, R. Wahl,
G. Schachtel, and P. Rice, Nucleic Acids Res. 20(Suppl.):2119-2144,
1992; K. E. Rudd, W. Miller, C. Werner, J. Ostell, C. Tolstoshev, and
S. G. Satterfield, Nucleic Acids Res. 19:637-647, 1991) in the near
future.