AN EFFICIENT STATISTIC TO DETECT OVER-REPRESENTED AND UNDER-REPRESENTED WORDS IN DNA-SEQUENCES

Authors
Citation
S. Schbath, AN EFFICIENT STATISTIC TO DETECT OVER-REPRESENTED AND UNDER-REPRESENTED WORDS IN DNA-SEQUENCES, Journal of Computational Biology, 4(2), 1997, pp. 189-192
Number of citations
6
Subject Categories
"Mathematical Methods, Biology & Medicine", Mathematics, Biology, "Biochemical Research Methods", Mathematics, "Biotechnology & Applied Microbiology"
ISSN journal
10665277
Volume
4
Issue
2
Year of publication
1997
Pages
189 - 192
Database
ISI
SICI code
1066-5277(1997)4:2<189:AESTDO>2.0.ZU;2-X
Abstract
In this note, we point out a very efficient statistic to detect over- and under-represented words in DNA sequences, when Markov chain models are used to represent the sequences. This statistic is missing from the recent review done on this important problem and appears to be a better measure of rarity and abundance of words in DNA sequences.