J.R. Yates et al., Mining genomes: correlating tandem mass spectra of modified and unmodified peptides to sequences in nucleotide databases, Analytical Chemistry, 67(18), 1995, pp. 3202-3210
The correlation of uninterpreted tandem mass spectra of modified and unmodified peptides, produced under low-energy (10-50 eV) collision conditions, with nucleotide sequences is demonstrated. In this method nucleotide databases are translated in six reading frames, and the resulting amino acid sequences are searched "on the fly" to identify and fit linear sequences to the fragmentation patterns observed in the tandem mass spectra of peptides. A cross-correlation function is then used to provide a measurement of similarity between the mass-to-charge ratios for the fragment ions predicted by amino acid sequences translated from the nucleotide database and the fragment ions observed in the tandem mass spectrum. In general, a difference greater than 0.1 between the normalized cross-correlation functions for the first- and second-ranked search results indicates a successful match between sequence and spectrum. Measurements of the deviation from maximum similarity employing the spectral reconstruction method are made. The search method employing nucleotide databases is also demonstrated on the spectra of phosphorylated peptides. Specific sites of modification are identified even though no specific information relevant to sites of modification is contained in the character-based sequence information of nucleotide databases.
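
The two mechanical steps named in the abstract, six-frame translation of a nucleotide sequence and the 0.1 score-gap rule between the first- and second-ranked results, can be illustrated with a minimal Python sketch. This is not the authors' program: the function names (`six_frame_translate`, `is_confident_match`), the frame labels, the use of the standard genetic code, and the `X` placeholder for codons containing ambiguous bases are all assumptions made for illustration. The cross-correlation scoring itself is not shown.

```python
# Illustrative sketch only; names and conventions here are hypothetical,
# not taken from the paper.

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon.

    Codons with characters other than A, C, G, T become 'X' (assumption).
    """
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translate(dna):
    """Translate a DNA string in three forward and three reverse frames."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def is_confident_match(normalized_scores, min_gap=0.1):
    """Apply the abstract's rule of thumb: the gap between the best and
    second-best normalized cross-correlation scores must exceed min_gap."""
    ranked = sorted(normalized_scores, reverse=True)
    if len(ranked) < 2:
        return False
    return ranked[0] - ranked[1] > min_gap


frames = six_frame_translate("ATGGCCTAA")
```

For the nine-base input above, `frames["+1"]` is `MA*` and `frames["-1"]` is `LGH`. With normalized scores of 1.0 and 0.85 the gap of 0.15 passes the rule, while 1.0 and 0.95 does not.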