Basic gene grammars and DNA-ChartParser for language processing of Escherichia coli promoter DNA sequences

Citation
S.W. Leung et al., Basic gene grammars and DNA-ChartParser for language processing of Escherichia coli promoter DNA sequences, Bioinformatics, 17(3), 2001, pp. 226-236
Number of citations
31
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
Journal ISSN
1367-4803
Volume
17
Issue
3
Year of publication
2001
Pages
226 - 236
Database
ISI
SICI code
1367-4803(200103)17:3<226:BGGADF>2.0.ZU;2-N
Abstract
Motivation: The field of 'DNA linguistics' has emerged from pioneering work in computational linguistics and molecular biology. Most formal grammars in this field are expressed using Definite Clause Grammars, but these have computational limitations which must be overcome. The present study provides a new DNA parsing system, comprising a logic grammar formalism called Basic Gene Grammars and a bidirectional chart parser, DNA-ChartParser.
Results: The use of Basic Gene Grammars is demonstrated in representing many formulations of the knowledge of Escherichia coli promoters, including knowledge acquired from human experts, consensus sequences, statistics (weight matrices), symbolic learning, and neural network learning. The DNA-ChartParser provides bidirectional parsing facilities for BGGs in handling overlapping categories, gap categories, approximate pattern matching, and constraints. Basic Gene Grammars and the DNA-ChartParser allowed different sources of knowledge for recognizing E. coli promoters to be combined to achieve better accuracy, as assessed by parsing these DNA sequences in real-world data sets.
Availability: DNA-ChartParser runs under SICStus Prolog. It and a few examples of Basic Gene Grammars are available at the URL: http://www.dai.ed.ac.uk/-siu/DNA.