A STATISTICAL MODEL FOR LOCATING REGULATORY REGIONS IN GENOMIC DNA

Citation
Em. Crowley et al., A STATISTICAL MODEL FOR LOCATING REGULATORY REGIONS IN GENOMIC DNA, Journal of Molecular Biology, 268(1), 1997, pp. 8-14
Number of citations
31
Subject Categories
Biology
Journal ISSN
00222836
Volume
268
Issue
1
Year of publication
1997
Pages
8 - 14
Database
ISI
SICI code
0022-2836(1997)268:1<8:ASFLRR>2.0.ZU;2-J
Abstract
In addition to genes, chromosomal DNA contains sequences that serve as signals for turning on and off gene expression. These signals are thought to be distributed as clusters in the regulatory regions of genes. We develop a Bayesian model that views locating regulatory regions in genomic DNA as a change-point problem, with the beginning of regulatory and non-regulatory regions corresponding to the change points. The model is based on a hidden Markov chain. The data consist of nucleotide positions of protein-binding elements in a genomic DNA sequence. These positions are identified using a reference catalogue containing elements that interact with transcription factors implicated in controlling the expression of protein-encoding genes. Among the protein-binding elements in a genomic DNA sequence, the statistical model automatically selects those that tend to predict regulatory regions. We test the model using viral sequences that include known regulatory regions and provide the results obtained for human genomic DNA corresponding to the beta-globin locus on chromosome 11. (C) 1997 Academic Press Limited.
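
Illustrative sketch
The change-point view in the abstract can be illustrated with a toy two-state hidden Markov chain. This is not the authors' Bayesian model: the binning of element positions, the Poisson emission rates, the switching probability, and the use of Viterbi decoding (instead of Bayesian posterior inference) are all assumptions made for illustration, and every function name and number below is hypothetical.

```python
import math

def poisson_logpmf(k, rate):
    # Log-probability of observing k elements in a bin with the given mean rate.
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def bin_positions(positions, seq_length, bin_size):
    # Count binding-element positions (0-based) falling in each fixed-width bin.
    n_bins = (seq_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for p in positions:
        counts[p // bin_size] += 1
    return counts

def viterbi(counts, rates=(0.5, 4.0), p_switch=0.05):
    # State 0 = non-regulatory (sparse elements), state 1 = regulatory (clustered).
    # rates and p_switch are assumed values, not estimates from the paper.
    log_stay = math.log(1.0 - p_switch)
    log_switch = math.log(p_switch)
    score = [math.log(0.5) + poisson_logpmf(counts[0], rates[s]) for s in range(2)]
    back = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in range(2):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            pointers.append(s if stay >= switch else 1 - s)
            new_score.append(max(stay, switch) + poisson_logpmf(k, rates[s]))
        score = new_score
        back.append(pointers)
    # Trace back the most probable state path.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def change_points(path):
    # Bin indices at which the decoded state differs from the previous bin.
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]

# Made-up element positions with one cluster between 400 and 600.
positions = [50, 410, 420, 430, 450, 510, 520, 530, 560, 590, 950]
counts = bin_positions(positions, seq_length=1000, bin_size=100)
path = viterbi(counts)
print(counts)
print(path)
print(change_points(path))
```

In this toy run the decoded path marks the bins holding the cluster as state 1, and the change points are the bin boundaries where the state switches. A faithful implementation would follow the paper's Bayesian formulation, which the abstract does not spell out.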