FINDING WORDS WITH UNEXPECTED FREQUENCIES IN DEOXYRIBONUCLEIC-ACID SEQUENCES

Citation
B. Prum et al., FINDING WORDS WITH UNEXPECTED FREQUENCIES IN DEOXYRIBONUCLEIC-ACID SEQUENCES, Journal of the Royal Statistical Society. Series B: Methodological, 57(1), 1995, pp. 205-220
Citations number
11
Subject Categories
Statistics & Probability
Journal title
Journal of the Royal Statistical Society. Series B: Methodological
ISSN journal
00359246
Volume
57
Issue
1
Year of publication
1995
Pages
205 - 220
Database
ISI
SICI code
1369-7412(1995)57:1<205:FWWUFI>2.0.ZU;2-F
Abstract
Considering a Markov chain model for deoxyribonucleic acid sequences, this paper proposes two asymptotically normal statistics to test whether the frequency of a given word is concordant with the first-order Markov chain model or not. The problem is to choose estimates mu(W) of the expectation of the frequency M(W) of a word W in the observed sequence such that the asymptotic variance of M(W) - mu(W) is easily computable. The first estimator is derived from the frequency of W[-1], which is W with its last letter deleted. The second, following an idea of Cowan, is the conditional expectation of M(W) given the observed frequencies of all two-letter words. Two examples on phage lambda and phage T7 are shown.
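The abstract does not state the formula for the first estimator, so the following is only a minimal sketch under an assumption: that mu(W) takes the usual first-order plug-in form N(W[-1]) * N(ab) / N(a), where a and b are the last two letters of W and N(.) counts overlapping occurrences in the sequence. The function names are hypothetical and are not taken from the paper, and the asymptotic variance needed to turn M(W) - mu(W) into a test statistic is not reproduced here.

```python
def count_overlapping(seq, word):
    """Count occurrences of `word` in `seq`, allowing overlaps."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def plug_in_expected_count(seq, word):
    """Assumed plug-in estimate of the expected count of `word` under a
    first-order Markov chain: N(W[-1]) * N(ab) / N(a).

    W[-1] is `word` without its last letter; a and b are its last two
    letters. This form is an assumption, not quoted from the paper.
    """
    if len(word) < 3:
        raise ValueError("word must have at least three letters")
    prefix = word[:-1]            # W[-1]: W with its last letter deleted
    a, b = word[-2], word[-1]
    n_prefix = count_overlapping(seq, prefix)
    n_ab = count_overlapping(seq, a + b)
    # Count only occurrences of `a` that are followed by another letter,
    # so that N(ab) / N(a) estimates the transition probability a -> b.
    n_a = count_overlapping(seq[:-1], a)
    if n_a == 0:
        return 0.0
    return n_prefix * n_ab / n_a


if __name__ == "__main__":
    seq = "ACGTACGTACGA"          # toy sequence, not real phage data
    word = "ACG"
    observed = count_overlapping(seq, word)
    expected = plug_in_expected_count(seq, word)
    print(observed, expected, observed - expected)
```

On real data the difference between the observed and estimated counts would still have to be divided by an estimate of its standard deviation before being compared with a normal distribution.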