ProtEST: protein multiple sequence alignments from expressed sequence tags

Citation
J.A. Cuff et al., ProtEST: protein multiple sequence alignments from expressed sequence tags, Bioinformatics, 16(2), 2000, pp. 111-116
Number of citations
25
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
Journal ISSN
1367-4803
Volume
16
Issue
2
Year of publication
2000
Pages
111 - 116
Database
ISI
SICI code
1367-4803(200002)16:2<111:PPMSAF>2.0.ZU;2-G
Abstract
Motivation: An automatic sequence searching method (ProtEST) is described which constructs multiple protein sequence alignments from protein sequences and translated expressed sequence tags (ESTs). ProtEST is more effective than a simple TBLASTN search of the query against the EST database, as the sequences are automatically clustered, assembled, made non-redundant, checked for sequence errors, translated into protein and then aligned and displayed.
Results: A ProtEST search found a non-redundant, translated, error- and length-corrected EST sequence for >58% of sequences when single sequences from 1407 Pfam-A seed alignments were used as the probe. The average family size of the resulting alignments of translated EST sequences was >10 sequences. In a cross-validated test of protein secondary structure prediction, alignments from the new procedure led to an improvement of 3.4% in average Q(3) prediction accuracy over single sequences.
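For readers unfamiliar with the Q(3) measure quoted in the abstract, the following minimal sketch shows how a per-sequence three-state accuracy is commonly computed: the percentage of residues whose predicted state (helix H, strand E, coil C) matches the observed state. The function name `q3_accuracy` and the example strings are hypothetical and are not taken from the paper; the record does not say how the authors reduced structural assignments to three states or how they averaged across sequences.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues with a correctly predicted state.

    Both arguments are strings over a three-state alphabet
    (H = helix, E = strand, C = coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 17-residue example (not data from the paper).
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHCCCCEEEECCC"
score = q3_accuracy(predicted, observed)
print(f"Q3 = {score:.1f}%")
```

An average Q(3) over a test set would then be the mean of such per-sequence scores, which is one plausible reading of the "average Q(3) prediction accuracy" in the abstract.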