Reliable automatic protein identification from matrix-assisted laser desorption/ionization mass spectrometric peptide fingerprints

Citation
P. Berndt et al., Reliable automatic protein identification from matrix-assisted laser desorption/ionization mass spectrometric peptide fingerprints, ELECTROPHOR, 20(18), 1999, pp. 3521-3526
Number of citations
8
Subject categories
Chemistry & Analysis
Journal title
ELECTROPHORESIS
ISSN journal
0173-0835
Volume
20
Issue
18
Year of publication
1999
Pages
3521 - 3526
Database
ISI
SICI code
0173-0835(199912)20:18<3521:RAPIFM>2.0.ZU;2-F
Abstract
Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry of protein samples from two-dimensional (2-D) gels in conjunction with protein sequence database searches is frequently used to identify proteins. Moreover, the automatic analysis of complete 2-D gels with hundreds and even thousands of protein spots ("proteome analysis") is possible, without human intervention, with the availability of highly accurate mass spectrometry instruments, and high-throughput facilities for preparation and handling of protein samples from 2-D gels. However, the lack of software for precise automatic analysis and annotation of mass spectra, as well as software for in-batch sequence database queries, is increasingly becoming a significant bottleneck for the proteomics workflow. In the present paper we outline an algorithm for reliable, accurate, and automatic evaluation of mass spectrometric data and database searches. We show here that simply selecting from the sequence database the protein that has the most matching fragment masses often leads to false-positive results. Reliable protein identification is dependent on several parameters: the accuracy of fragment mass determination, the number of masses submitted for query, the mass distribution of query masses, the number of masses matching between sample and database protein, the size of the sequence database, and the kind and number of modifications considered. Using these parameters, we derive a simple statistical estimation that can be used to calculate the probability of true-positive protein identification.
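The abstract does not give the estimator itself, so the following is only an illustrative sketch of how the listed parameters might interact, not the authors' method. It assumes a plain binomial model in which each query mass matches an unrelated database protein independently with probability `p_single`. That single hypothetical parameter stands in for mass accuracy, the mass distribution of the query, and the modifications considered. The function names are likewise invented for this example.

```python
from math import comb


def chance_match_probability(n_query: int, k_matched: int, p_single: float) -> float:
    """Binomial tail: probability that an unrelated protein matches at least
    k_matched of n_query submitted masses purely by chance.

    p_single is an ASSUMED per-mass chance-match probability; in practice it
    would grow with wider mass tolerance and with more allowed modifications.
    """
    return sum(
        comb(n_query, i) * p_single**i * (1.0 - p_single) ** (n_query - i)
        for i in range(k_matched, n_query + 1)
    )


def prob_no_random_hit(n_query: int, k_matched: int, p_single: float, db_size: int) -> float:
    """Probability that none of db_size unrelated database proteins reaches
    k_matched matches by chance (entries treated as independent, an assumption).

    A value near 1 suggests a hit with k_matched matches is unlikely to be a
    false positive; a value near 0 suggests chance hits are expected.
    """
    tail = chance_match_probability(n_query, k_matched, p_single)
    return (1.0 - tail) ** db_size


if __name__ == "__main__":
    # Hypothetical numbers: 20 query masses, 5% chance match per mass,
    # 100,000 database entries.
    for k in (4, 6, 8, 10):
        print(k, prob_no_random_hit(20, k, 0.05, 100_000))
```

Under these assumptions, the same number of matching masses becomes less convincing as the database grows or as the mass tolerance loosens, which is consistent with the abstract's point that the top-scoring protein by match count alone can be a false positive.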