Comparison of the Escherichia coli K-12 genome with sampled genomes of a Klebsiella pneumoniae and three Salmonella enterica serovars, Typhimurium, Typhi and Paratyphi
M. McClelland et al., Comparison of the Escherichia coli K-12 genome with sampled genomes of a Klebsiella pneumoniae and three Salmonella enterica serovars, Typhimurium, Typhi and Paratyphi, Nucleic Acids Research, 28(24), 2000, pp. 4974-4986
The Escherichia coli K-12 genome (ECO) was compared with the sampled genomes
of the sibling species Salmonella enterica serovars Typhimurium, Typhi and
Paratyphi A (collectively referred to as SAL) and the genome of the close
outgroup Klebsiella pneumoniae (KPN). There are at least 160 locations where
sequences of >400 bp are absent from ECO but present in the genomes of all
three SAL and 394 locations where sequences are present in ECO but close
homologs are absent in all SAL genomes. The 394 sequences in ECO that do not
occur in SAL contain 1350 (30.6%) of the 4405 ECO genes. Of these, 1165 are
missing from both SAL and KPN. Most of the 1165 genes are concentrated within
28 regions of 10-40 kb, which consist almost exclusively of such genes. Among
these regions were six that included previously identified cryptic phage. A
hypothetical ancestral state of genomic regions that differ between ECO and
SAL can be inferred in some cases by reference to the genome structure in KPN
and the more distant relative Yersinia pestis. However, many changes between
ECO and SAL are concentrated in regions where all four genera have a
different structure. The rate of gene insertion and deletion is sufficiently
high in these regions that the ancestral state of the ECO/SAL lineage cannot
be inferred from the present data. The sequencing of other closely related
genomes, such as S. bongori or Citrobacter, may help in this regard.
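
The gene counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not part of the original record. The three input counts are taken from the abstract, while the variable names, the helper `percent`, and the derived figures (the count of genes absent from SAL but not reported as missing from KPN, and the share of the genome absent from both) are illustrative assumptions computed from those counts.

```python
# Counts quoted in the abstract.
total_eco_genes = 4405          # annotated ECO genes
absent_from_sal = 1350          # ECO genes lying in the 394 sequences not found in SAL
absent_from_sal_and_kpn = 1165  # subset also missing from KPN


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Reproduces the percentage stated in the abstract.
share_absent_from_sal = percent(absent_from_sal, total_eco_genes)

# Derived values (assumption: simple subtraction and division of the quoted
# counts; the abstract does not state these figures itself).
absent_from_sal_only = absent_from_sal - absent_from_sal_and_kpn
share_absent_from_both = percent(absent_from_sal_and_kpn, total_eco_genes)

print(f"ECO genes absent from SAL: {share_absent_from_sal}%")
print(f"Absent from SAL, not reported missing from KPN: {absent_from_sal_only}")
print(f"ECO genes absent from both SAL and KPN: {share_absent_from_both}%")
```

Because the SAL and KPN genomes were only sampled, the derived figures are best read as bookkeeping on the quoted counts rather than as additional results.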