Recognition of transcription regulatory sites is one of the most difficult
problems of computational molecular biology. Small sample size and weak
conservation of signals in most cases do not allow construction of reliable
recognition rules. Here we suggest a new approach to this problem based
on simultaneous analysis of several related genomes. We assume that groups
of coregulated genes are evolutionarily stable. Thus, in each genome we
choose a set of genes having strong candidate sites in their regulatory
regions and then select families of related genes. By this assumption, these
families are subject to the regulation under consideration. This approach
was applied
to the analysis of purine regulons in Escherichia coli and Haemophilus
influenzae. Transcription of these genes is regulated by PurR. We have identified
PurR binding sites in the genome of H. influenzae and found a new family of
E. coli and H. influenzae transport proteins regulated by PurR.
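
The selection procedure described above can be illustrated with a minimal
Python sketch. It is not the authors' implementation: the abstract does not
say how candidate sites are scored, so a simple position weight matrix is
assumed here, and the 4-bp profile, the threshold, the gene identifiers, the
upstream sequences and the family assignments are all hypothetical toy values.
The sketch keeps, in each genome, the genes whose upstream region contains a
high-scoring site, and then retains only the gene families that have such a
candidate in every genome.

```python
from typing import Dict, List, Set

# Hypothetical log-odds weights for a 4-bp toy motif (NOT a real PurR profile).
PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def best_site_score(upstream: str, pwm: List[Dict[str, float]]) -> float:
    """Highest profile score over all windows of an upstream region.

    Assumes the sequence contains only uppercase A, C, G and T.
    """
    width = len(pwm)
    scores = [
        sum(pwm[i][base] for i, base in enumerate(upstream[start:start + width]))
        for start in range(len(upstream) - width + 1)
    ]
    return max(scores) if scores else float("-inf")


def candidate_genes(
    upstream_by_gene: Dict[str, str],
    pwm: List[Dict[str, float]],
    threshold: float,
) -> Set[str]:
    """Genes of one genome with a strong candidate site upstream."""
    return {
        gene
        for gene, seq in upstream_by_gene.items()
        if best_site_score(seq, pwm) >= threshold
    }


def conserved_families(
    candidates_by_genome: Dict[str, Set[str]],
    family_of: Dict[str, str],
) -> Set[str]:
    """Families that contain a candidate gene in every genome."""
    per_genome = [
        {family_of[g] for g in genes if g in family_of}
        for genes in candidates_by_genome.values()
    ]
    return set.intersection(*per_genome) if per_genome else set()


# Hypothetical toy input: upstream regions per gene and gene-to-family map.
GENOME_1 = {"g1_a": "TTACGTTT", "g1_b": "GGGGGGGG", "g1_c": "CACGTCCC"}
GENOME_2 = {"g2_a": "AACGTAAA", "g2_b": "TTTTACGT", "g2_c": "CCCCCCCC"}
FAMILY_OF = {
    "g1_a": "fam1", "g2_a": "fam1",
    "g1_b": "fam2", "g2_b": "fam2",
    "g1_c": "fam3", "g2_c": "fam3",
}

CANDIDATES = {
    "genome_1": candidate_genes(GENOME_1, PWM, threshold=4.0),
    "genome_2": candidate_genes(GENOME_2, PWM, threshold=4.0),
}
CONSERVED = conserved_families(CANDIDATES, FAMILY_OF)
```

In this toy input, fam2 and fam3 each have a candidate site in only one of
the two genomes and are discarded; only fam1, which has a candidate in both,
is retained. This is the filtering effect that the assumption of
evolutionarily stable coregulation is meant to provide.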