EBEST - AN AUTOMATED TOOL USING EXPRESSED SEQUENCE TAGS TO DELINEATE GENE STRUCTURE

Authors
Citation
J. Jiang and H.J. Jacob, EBEST - AN AUTOMATED TOOL USING EXPRESSED SEQUENCE TAGS TO DELINEATE GENE STRUCTURE, PCR methods and applications, 8(3), 1998, pp. 268-275
Number of citations
16
Subject Categories
Biotechnology & Applied Microbiology, Biology, Genetics & Heredity
ISSN journal
10549803
Volume
8
Issue
3
Year of publication
1998
Pages
268 - 275
Database
ISI
SICI code
1054-9803(1998)8:3<268:E-AATU>2.0.ZU;2-6
Abstract
Large numbers of expressed sequence tags (ESTs) continue to fill public and private databases with partial cDNA sequences. However, using this huge amount of ESTs to facilitate gene finding in genomic sequence imposes a challenge, especially to wet-lab scientists who often have limited computing resources. In an effort to consolidate the information hidden in the vast number of ESTs into a readable and manageable format, we have developed EbEST, a program that automates the process of using ESTs to help delineate gene structure in long stretches of genomic sequence. The EbEST program consists of three functional modules: the first module separates homologous ESTs into clusters and identifies the most informative ESTs within each cluster; the second module uses the informative ESTs to perform gapped alignment and to predict the exon-intron boundary; and the third module generates text file and graphic outputs that illustrate the orientation, exonic structure, and untranslated regions (UTRs) of putative genes in the genomic sequence being analyzed. Evaluation of EbEST with 176 human genes from the ALLSEQ set indicated that it performed in line with several existing gene-finding programs, but was more tolerant to sequencing errors. Furthermore, when EbEST was challenged with query sequences that harbor more than one gene, it suffered only a slight drop in performance, whereas the performance of the other programs evaluated decreased more. EbEST may be used as a stand-alone tool to annotate human genomic sequences with EST-derived gene elements, or can be used in conjunction with computational gene-recognition programs to increase the accuracy of gene prediction.
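
The clustering step described for the first module can be illustrated with a minimal sketch. It is not EbEST's actual algorithm, which the abstract does not specify. It assumes each EST has already been matched to the genomic sequence and is represented by a hypothetical `EstHit` record holding start and end coordinates. It further assumes that ESTs whose matched regions overlap belong to one cluster, and that the "most informative" EST is simply the one covering the longest genomic span. All names and coordinates below are invented for illustration.

```python
from typing import List, NamedTuple


class EstHit(NamedTuple):
    """Hypothetical record: one EST matched to a genomic region."""
    est_id: str
    start: int  # 1-based, inclusive genomic coordinate
    end: int    # 1-based, inclusive genomic coordinate


def cluster_hits(hits: List[EstHit]) -> List[List[EstHit]]:
    """Group hits whose genomic intervals overlap (assumed clustering rule)."""
    clusters: List[List[EstHit]] = []
    current_end = 0
    for hit in sorted(hits, key=lambda h: (h.start, h.end)):
        if clusters and hit.start <= current_end:
            # Overlaps the running cluster: extend it.
            clusters[-1].append(hit)
            current_end = max(current_end, hit.end)
        else:
            # No overlap: start a new cluster.
            clusters.append([hit])
            current_end = hit.end
    return clusters


def most_informative(cluster: List[EstHit]) -> EstHit:
    """Pick the hit with the longest genomic span (assumed criterion)."""
    return max(cluster, key=lambda h: h.end - h.start + 1)


# Invented example coordinates, not data from the paper.
hits = [
    EstHit("est_a", 100, 450),
    EstHit("est_b", 400, 900),
    EstHit("est_c", 5000, 5300),
    EstHit("est_d", 120, 300),
]
clusters = cluster_hits(hits)
representatives = [most_informative(c).est_id for c in clusters]
print(representatives)
```

In this toy input, the first three overlapping hits form one cluster and the distant hit forms a second. A real pipeline would derive the intervals from similarity searches and would pass the selected ESTs to a gapped alignment step, as the abstract describes for the second module.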