An analysis of gene-finding programs for Neurospora crassa

Authors

Kraemer, E; Wang, J; Guo, JH; Hopkins, S; Arnold, J

Citation

E. Kraemer et al., An analysis of gene-finding programs for Neurospora crassa, BIOINFORMAT, 17(10), 2001, pp. 901-912

Citations number

Subject categories

Multidisciplinary

Journal title

BIOINFORMATICS

ISSN journal

1367-4803

Volume

17

Issue

10

Year of publication

2001

Pages

901 - 912

Database

ISI

SICI code

1367-4803(200110)17:10<901:AAOGPF>2.0.ZU;2-T

Abstract

Motivation: Computational gene identification plays an important role in genome projects. The approaches used in gene identification programs are often tuned to one particular organism, and accuracy for one organism or class of organism does not necessarily translate to accurate predictions for other organisms. In this paper we evaluate five computer programs on their ability to locate coding regions and to predict gene structure in Neurospora crassa. One of these programs (FFG) was designed specifically for gene-finding in N.crassa, but the model parameters have not yet been fully 'tuned', and the program should thus be viewed as an initial prototype. The other four programs were neither designed nor tuned for N.crassa.

Results: We describe the data sets on which the experiments were performed, the approaches employed by the five algorithms: GenScan, HMMGene, GeneMark, Pombe and FFG, the methodology of our evaluation, and the results of the experiments. Our results show that, while none of the programs consistently performs well, overall the GenScan program has the best performance on sensitivity and Missing Exons (ME) while the HMMGene and FFG programs have good performance in locating the exons roughly. Additional work motivated by this study includes the creation of a tool for the automated evaluation of gene-finding programs, the collection of larger and more reliable data sets for N.crassa, parameterization of the model used in FFG to produce a more accurate gene-finding program for this species, and a more in-depth evaluation of the reasons that existing programs generally fail for N.crassa.