Compound Poisson and Poisson Process Approximations for Occurrences of Multiple Words in Markov Chains

Citation
G. Reinert and S. Schbath, Compound Poisson and Poisson Process Approximations for Occurrences of Multiple Words in Markov Chains, Journal of Computational Biology, 5(2), 1998, pp. 223-253
Number of citations
20
Subject Categories
Mathematics, Biology, Biochemical Research Methods, Biotechnology & Applied Microbiology
Journal ISSN
1066-5277
Volume
5
Issue
2
Year of publication
1998
Pages
223 - 253
Database
ISI
SICI code
1066-5277(1998)5:2<223:CPAPAF>2.0.ZU;2-5
Abstract
We derive a Poisson process approximation for the occurrences of clumps of multiple words and a compound Poisson process approximation for the number of occurrences of multiple words in a sequence of letters generated by a stationary Markov chain. Using the Chen-Stein method, we provide a bound on the error in the approximations. For rare words, these errors tend to zero as the length of the sequence increases to infinity. Modeling a DNA sequence as a stationary Markov chain, we show as an application that the compound Poisson approximation is efficient for the number of occurrences of rare stem-loop motifs.
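Illustrative sketch (not from the paper): the abstract distinguishes occurrences of words from clumps of overlapping occurrences. The Python below is a minimal illustration of that distinction only. It assumes a first-order chain on the alphabet ACGT with a uniform transition matrix, for which a uniform initial state is stationary, and it treats a clump as a maximal run of overlapping occurrences. The function names, the transition matrix, and the example words are all hypothetical choices made for illustration, and the code does not compute the Chen-Stein error bound.

```python
import random


def simulate_markov_chain(n, transition, alphabet="ACGT", seed=0):
    """Simulate n letters from a first-order Markov chain.

    Assumption: the initial letter is drawn uniformly, which matches the
    stationary law only for special matrices (e.g. the uniform one below).
    """
    rng = random.Random(seed)
    state = rng.choice(alphabet)
    letters = [state]
    for _ in range(n - 1):
        state = rng.choices(alphabet, weights=transition[state])[0]
        letters.append(state)
    return "".join(letters)


def occurrences(seq, words):
    """Return sorted (start, length) pairs for every (overlapping) occurrence."""
    return sorted(
        (i, len(w))
        for w in words
        for i in range(len(seq) - len(w) + 1)
        if seq.startswith(w, i)
    )


def clump_sizes(occ):
    """Group occurrences into clumps of overlapping occurrences.

    An occurrence joins the current clump if it starts before the clump ends.
    """
    sizes = []
    clump_end = -1
    for start, length in occ:
        if sizes and start < clump_end:
            sizes[-1] += 1
            clump_end = max(clump_end, start + length)
        else:
            sizes.append(1)
            clump_end = start + length
    return sizes


if __name__ == "__main__":
    # Hypothetical uniform transition matrix: each row lists weights for A, C, G, T.
    uniform = {a: [1, 1, 1, 1] for a in "ACGT"}
    seq = simulate_markov_chain(10_000, uniform)
    occ = occurrences(seq, ["ACGTAC", "GTACGT"])
    sizes = clump_sizes(occ)
    print("occurrences:", len(occ), "clumps:", len(sizes))
```

In this picture, the number of clumps is the quantity a Poisson approximation would target, while the total number of occurrences (the sum of the clump sizes) is the quantity a compound Poisson approximation would target.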