Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection
R. Gras et al., Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection, Electrophoresis, 20(18), 1999, pp. 3535-3550
We have developed a new algorithm to identify proteins by means of peptide
mass fingerprinting. Starting from the matrix-assisted laser
desorption/ionization-time-of-flight (MALDI-TOF) spectra and environmental
data such as species, isoelectric point and molecular weight, as well as
chemical modifications or number of missed cleavages of a protein, the
program performs a fully automated identification of the protein. The first
step is a peak detection algorithm, which allows precise and fast
determination of peptide masses, even if the peaks are of low intensity or
they overlap. In the second
step the masses and environmental data are used by the identification
algorithm to search in protein sequence databases (SWISS-PROT and/or TrEMBL)
for protein entries that match the input data. Consequently, a list of
candidate proteins is selected from the database, and a score calculation
provides
a ranking according to the quality of the match. To define the most
discriminating scoring calculation we analyzed the respective role of each
parameter in two directions. The first one is based on filtering and
exploratory
effects, while the second direction focuses on the levels where the
parameters intervene in the identification process. Thus, according to our
analysis, all input parameters contribute to the score, although with
different weights. Since it is difficult to estimate the weights in advance,
they have been computed with a genetic algorithm, using a training set of 91
protein spectra with their environmental data. We tested the resulting
scoring calculation on a test set of ten proteins and compared the
identification results with those of other peptide mass fingerprinting
programs.
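
The identification step described above (matching measured peptide masses against database entries, then ranking candidates by a weighted score) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the 0.2 Da tolerance, the two score terms and the weights `w_hits` and `w_mw` are all hypothetical placeholders, whereas the abstract states only that every input parameter contributes with a weight fitted on a training set. The monoisotopic sequence mass is also used here as a rough stand-in for the molecular weight.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(seq, missed=1):
    """Cleave after K or R (not before P), allowing up to `missed` missed cleavages."""
    cuts = [0]
    for i, aa in enumerate(seq[:-1]):
        if aa in "KR" and seq[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(seq))
    peptides = []
    for i in range(len(cuts) - 1):
        for j in range(i + 1, min(i + 2 + missed, len(cuts))):
            peptides.append(seq[cuts[i]:cuts[j]])
    return peptides


def mh_mass(peptide):
    """Monoisotopic [M+H]+ mass of a peptide."""
    return sum(RESIDUE[a] for a in peptide) + WATER + PROTON


def matched_fraction(peaks, seq, tol=0.2, missed=1):
    """Fraction of measured peaks within `tol` Da of a theoretical peptide mass."""
    theoretical = [mh_mass(p) for p in tryptic_peptides(seq, missed)]
    hits = sum(any(abs(m - t) <= tol for t in theoretical) for m in peaks)
    return hits / len(peaks) if peaks else 0.0


def score(peaks, mw_obs, seq, w_hits=0.8, w_mw=0.2):
    """Weighted sum of two terms; the weights are hand-picked placeholders."""
    protein_mass = sum(RESIDUE[a] for a in seq) + WATER
    mw_term = max(0.0, 1.0 - abs(protein_mass - mw_obs) / mw_obs)
    return w_hits * matched_fraction(peaks, seq) + w_mw * mw_term


def rank(peaks, mw_obs, database):
    """Return database entry names sorted from best to worst score."""
    return sorted(database,
                  key=lambda name: score(peaks, mw_obs, database[name]),
                  reverse=True)


# Toy usage with made-up sequences and peaks derived from the first entry.
database = {"toy1": "AKGGRSSK", "toy2": "MCWKFFYR"}
peaks = [mh_mass("AK"), mh_mass("GGR")]
print(rank(peaks, 800.0, database))
```

In a setup closer to the one described, the fixed weights would be replaced by values optimized on a training set of spectra, and further terms (species, isoelectric point, chemical modifications) would enter the score.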