F.J. Pineda et al., Testing the significance of microorganism identification by mass spectrometry and proteome database search, ANALYT CHEM, 72(16), 2000, pp. 3739-3744
We derive and validate a simple statistical model that predicts the
distribution of false matches between peaks in matrix-assisted laser
desorption/ionization mass spectrometry data and proteins in proteome
databases. The model allows us to calculate the significance of previously
reported microorganism identification results. In particular, for
Delta m = +/-1.5 Da, we find that the computed significance levels are
sufficient to demonstrate the ability to identify microorganisms, provided
the number of candidate microorganisms is limited to roughly three
Escherichia coli-like or roughly 10 Bacillus subtilis-like microorganisms
(in the sense of having roughly the same number of proteins per unit-mass
interval). We conclude that, given the cluttered and incomplete nature of
the data, it is likely that neither simple ranking nor simple hypothesis
testing will be sufficient for truly robust microorganism identification
over a large number of candidate microorganisms.
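
The abstract does not spell out the model itself, so the following is only a
minimal sketch of how a false-match significance calculation of this general
kind could be set up. It is not necessarily the model derived in the paper.
It assumes (1) that protein masses near a peak are scattered at random with a
constant density of proteins per Da, so that the chance of a spurious match
within +/-Delta m follows from Poisson statistics, (2) that peaks match
independently, so that the number of false matches is binomial, and (3) that
testing several candidate microorganisms is handled with a Bonferroni-style
correction. All function names and the numeric inputs are hypothetical.

```python
import math


def false_match_probability(density_per_da, delta_m):
    """Chance that at least one database protein lies within +/-delta_m of a peak.

    Assumption: protein masses form a Poisson process with constant density
    (proteins per Da), so the window of width 2*delta_m is empty with
    probability exp(-2 * delta_m * density).
    """
    return 1.0 - math.exp(-2.0 * delta_m * density_per_da)


def p_value(num_peaks, num_matches, p_false):
    """Probability of seeing num_matches or more matches purely by chance.

    Assumption: each of the num_peaks peaks matches independently with
    probability p_false, giving a binomial upper tail.
    """
    return sum(
        math.comb(num_peaks, k) * p_false**k * (1.0 - p_false) ** (num_peaks - k)
        for k in range(num_matches, num_peaks + 1)
    )


def corrected_p_value(p, num_candidates):
    """Bonferroni-style correction for testing several candidate organisms."""
    return min(1.0, p * num_candidates)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the calculation.
    p_false = false_match_probability(density_per_da=0.05, delta_m=1.5)
    p = p_value(num_peaks=20, num_matches=10, p_false=p_false)
    for n in (1, 3, 10, 100):
        print(n, corrected_p_value(p, n))
```

Under these assumptions the corrected p-value grows linearly with the number
of candidates, which illustrates why a significance level that is convincing
for a handful of candidate microorganisms can stop being convincing when the
candidate list is large.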