Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection

Citation
R. Gras et al., Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection, Electrophoresis, 20(18), 1999, pp. 3535-3550
Number of citations
30
Subject categories
Chemistry & Analysis
Journal title
ELECTROPHORESIS
ISSN journal
0173-0835
Volume
20
Issue
18
Year of publication
1999
Pages
3535 - 3550
Database
ISI
SICI code
0173-0835(199912)20:18<3535:IPIFPM>2.0.ZU;2-F
Abstract
We have developed a new algorithm to identify proteins by means of peptide mass fingerprinting. Starting from the matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) spectra and environmental data such as species, isoelectric point and molecular weight, as well as chemical modifications or number of missed cleavages of a protein, the program performs a fully automated identification of the protein. The first step is a peak detection algorithm, which allows precise and fast determination of peptide masses, even if the peaks are of low intensity or they overlap. In the second step the masses and environmental data are used by the identification algorithm to search in protein sequence databases (SWISS-PROT and/or TrEMBL) for protein entries that match the input data. Consequently, a list of candidate proteins is selected from the database, and a score calculation provides a ranking according to the quality of the match. To define the most discriminating scoring calculation we analyzed the respective role of each parameter in two directions. The first one is based on filtering and exploratory effects, while the second direction focuses on the levels where the parameters intervene in the identification process. Thus, according to our analysis, all input parameters contribute to the score, however with different weights. Since it is difficult to estimate the weights in advance, they have been computed with a generic algorithm, using a training set of 91 protein spectra with their environmental data. We tested the resulting scoring calculation on a test set of ten proteins and compared the identification results with those of other peptide mass fingerprinting programs.
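Illustration
The second step described in the abstract (matching measured peptide masses against database entries, then ranking candidates by a weighted score) can be sketched roughly as follows. This is not the authors' scoring function: the `Candidate` record, the three score terms, the 100 ppm tolerance and the weight names are all hypothetical choices made for illustration, and the abstract itself notes that the real weights were fitted on a training set rather than set by hand.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    """Hypothetical database entry: theoretical peptide masses plus MW and pI."""
    name: str
    peptide_masses: List[float]  # theoretical peptide masses, in Da
    mw: float                    # molecular weight, in Da
    pi: float                    # isoelectric point


def count_matches(observed: List[float], theoretical: List[float],
                  tol_ppm: float) -> int:
    """Count observed masses lying within tol_ppm of any theoretical mass."""
    hits = 0
    for m in observed:
        if any(abs(m - t) <= t * tol_ppm * 1e-6 for t in theoretical):
            hits += 1
    return hits


def score(cand: Candidate, observed: List[float], exp_mw: float,
          exp_pi: float, weights: Dict[str, float],
          tol_ppm: float = 100.0) -> float:
    """Weighted sum of three terms, each scaled to the range [0, 1]."""
    match_term = count_matches(observed, cand.peptide_masses, tol_ppm) / len(observed)
    mw_term = max(0.0, 1.0 - abs(cand.mw - exp_mw) / exp_mw)
    pi_term = max(0.0, 1.0 - abs(cand.pi - exp_pi) / 14.0)
    return (weights["match"] * match_term
            + weights["mw"] * mw_term
            + weights["pi"] * pi_term)


def rank(candidates: List[Candidate], observed: List[float], exp_mw: float,
         exp_pi: float, weights: Dict[str, float]) -> List[Tuple[float, str]]:
    """Return (score, name) pairs sorted from best to worst match."""
    scored = [(score(c, observed, exp_mw, exp_pi, weights), c.name)
              for c in candidates]
    return sorted(scored, reverse=True)
```

Calling `rank` with a list of candidates, the measured masses, the experimental molecular weight and isoelectric point, and a weight dictionary with the keys `"match"`, `"mw"` and `"pi"` returns the candidates ordered by score. Changing the weights changes how strongly each kind of evidence influences the ranking, which is why the abstract treats their estimation as a separate problem.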