EVALUATION OF ALGORITHMS USED FOR CROSS-SPECIES PROTEOME CHARACTERIZATION

Citation
S.J. Cordwell and I. Humphery-Smith, EVALUATION OF ALGORITHMS USED FOR CROSS-SPECIES PROTEOME CHARACTERIZATION, Electrophoresis, 18(8), 1997, pp. 1410-1417
Number of citations
40
Subject categories
Biochemical Research Methods
Journal title
Electrophoresis
Journal ISSN
0173-0835
Volume
18
Issue
8
Year of publication
1997
Pages
1410 - 1417
Database
ISI
SICI code
0173-0835(1997)18:8<1410:EOAUFC>2.0.ZU;2-O
Abstract
The ability to effectively search databases for the identification of protein spots from two-dimensional electrophoresis gels has become an essential step in the study of microbial proteomes. A variety of analytical techniques are currently being employed during protein characterisation. A number of algorithms used to search databases, accessible via the World Wide Web, depend upon information concerning N- and C-terminal microsequence, amino acid composition, and peptide-mass fingerprinting. The effectiveness of nine such algorithms, as well as COMBINED (software developed in this laboratory for identifying proteins across species boundaries), was examined. Fifty-four ribosomal proteins from the Mycoplasma genitalium genome, and 72 aminoacyl-tRNA synthetases from the Haemophilus influenzae, M. genitalium and Methanococcus jannaschii genomes were chosen for study. These proteins were selected because they represent a wide range of sequence identities across species boundaries (22.7-100% identity), as detected by standard sequence alignment tools. Such sequence variation allowed for a statistical comparison of algorithm success measured against published sequence identity. The ability of analytical techniques used in protein characterisation and associated database query programs to detect identity at the functional group level was examined for proteins with low levels of homology at the gene/protein sequence level. The significance of these theoretical data manipulations provided the means to predict the utility of data acquired experimentally for non-sequence-dependent software in proteome analysis. The data obtained also predicted that 'sequence tagging' of peptide fingerprints would need to be accompanied by at least 11-20 residues of amino acid sequence for it to be widely used for protein characterisation across species boundaries.