An approach is described for genomic database searching based on
experimentally observed proteolytic fragments, e.g., isolated from 1D or 2D
gels or analyzed directly, that can be applied to unfinished prokaryotic
genomic data in the absence of annotations or previously assigned open
reading frames (ORFs). This variation on the database search is in contrast
to the more familiar use of peptide mass spectral fragmentation data to
search fully annotated inferred protein databases, e.g., OWL or SWISS-PROT.
We compared the SEQUEST search results from a six reading frame translation
of the Porphyromonas gingivalis genome DNA sequence with those from
computationally derived ORFs created using publicly available genomics
software tools. The ORF approach eliminated many of the artifacts present in
output from the six reading frame search. The method was applied to
uninterpreted tandem mass spectrometric data derived from proteins secreted
by the periodontal pathogen Porphyromonas gingivalis in response to the
gingival epithelial cell environment, a model system for the study of
host-pathogen interactions relevant to human periodontal disease.
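
To make the contrast between the two search databases concrete, the following is a minimal sketch of a six reading frame translation and a naive ORF extraction. It is illustrative only and is not the software used in this work: the function names, the `min_length` parameter, and the rule of trimming each stop-delimited segment to its first methionine are assumptions, and the standard codon assignments are assumed to apply.

```python
from itertools import product

# Standard codon assignments, listed in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate all three offsets on both strands, keyed '+1'..'+3', '-1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def candidate_orfs(dna, min_length=3):
    """Hypothetical naive ORF caller: split each frame at stops ('*'),
    trim each segment to its first Met, and keep those of at least
    min_length residues. Segments running off the end of the sequence
    are kept, since the input may be an unfinished contig."""
    orfs = []
    for frame, protein in six_frame_translation(dna).items():
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start >= min_length:
                orfs.append((frame, segment[start:]))
    return orfs
```

Calling `six_frame_translation` on a contig yields six long strings interrupted by stop codons, most of which do not correspond to real proteins, whereas `candidate_orfs` keeps only the plausible coding segments. Restricting the search database in this way is one plausible reason an ORF-based search would show fewer artifacts than a raw six-frame search.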