Database search in tandem mass spectrometry is a powerful tool for protein
identification. High-throughput spectral acquisition raises the problem of
dealing with genetic variation and peptide modifications within a population
of related proteins. A method that cross-correlates and clusters related
spectra in large collections of uncharacterized spectra (i.e., from normal
and diseased individuals) would be very valuable in functional proteomics.
This problem is far from simple, since very similar peptides may have
very different spectra. We introduce a new notion of spectral similarity
that allows one to identify related spectra even if the corresponding
peptides have multiple modifications/mutations. Based on this notion, we developed
a new algorithm for mutation-tolerant database search as well as a method
for cross-correlating related uncharacterized spectra.