The 2,160,837-base pair genome sequence of an isolate of Streptococcus
pneumoniae, a Gram-positive pathogen that causes pneumonia, bacteremia,
meningitis, and otitis media, contains 2236 predicted coding regions; of
these, 1440 (64%) were assigned a biological role. Approximately 5% of the
genome is
composed of insertion sequences that may contribute to genome
rearrangements through uptake of foreign DNA. Extracellular enzyme systems
for the metabolism of polysaccharides and hexosamines provide a substantial
source of carbon and nitrogen for S. pneumoniae and also damage host
tissues and facilitate colonization. A motif identified within the signal
peptide of proteins is potentially involved in targeting these proteins to
the cell surface of low-guanine/cytosine (GC) Gram-positive species.
Several surface-exposed
proteins that may serve as potential vaccine candidates were identified.
Comparative genome hybridization with DNA arrays revealed strain differences
in S. pneumoniae that could contribute to differences in virulence and
antigenicity.