I. Humphery-Smith and W. Blackstock, PROTEOME ANALYSIS - GENOMICS VIA THE OUTPUT RATHER THAN THE INPUT CODE, Journal of protein chemistry, 16(5), 1997, pp. 537-544
A knowledge of the 'proteome,' total protein output encoded by a genome, provides information on (1) if and when predicted gene products are translated, (2) the relative concentrations of gene products, and (3) the extent of posttranslational modification, none of which can be accurately predicted from the nucleic acid sequence alone. The current status of proteome analysis is reviewed with respect to some of the techniques employed, automation, relevance to genomic studies, mass spectrometry and bioinformatics, limitations, and recent improvements in resolution and sensitivity for the detection of protein expression in whole cells, tissues, or organisms. The concept of 'proteomic contigs' is introduced for the first time. Traditional approaches to genomic analysis call upon a number of strategies to produce contiguous DNA sequence information, while 'proteomic contigs' are derived from multiple molecular mass and isoelectric point windows in order to construct a picture of the total protein expression within living cells. In higher eukaryotes, the latter may require several dozen image subsets of protein spots to be stitched together using advanced image analysis. The utility of both experimental and theoretical peptide-mass fingerprinting (PMF) and associated bioinformatics is outlined. A previously unknown motif within the peptide sequence of Elongation Factor Tu from Thermus aquaticus was discovered using PMF. This motif was shown to possess potential significance in maintaining structural integrity of the entire molecule.
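
As a rough illustration of what theoretical peptide-mass fingerprinting involves, the sketch below digests a protein sequence in silico and compares the predicted peptide masses with a list of observed masses. It is not the authors' method. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, no modifications, and neutral monoisotopic masses. The function names `tryptic_digest`, `peptide_mass`, and `count_matches`, the tolerance value, and the example sequence are all hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide (H on N-term, OH on C-term)


def tryptic_digest(seq):
    """Split seq after every K or R that is not followed by P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide without K/R
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, theoretical, tol=0.5):
    """Number of observed masses within tol Da of any theoretical mass."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


if __name__ == "__main__":
    sequence = "AKRPGKC"  # hypothetical toy sequence, not from the paper
    masses = [peptide_mass(p) for p in tryptic_digest(sequence)]
    print(tryptic_digest(sequence), masses)
    print(count_matches([217.14, 500.0], masses, tol=0.01))
```

In practice the match count would be computed against every entry in a sequence database and the candidates ranked. Real tools also account for missed cleavages, modifications, and instrument-specific mass accuracy, all of which this sketch leaves out.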