B. Prum et al., FINDING WORDS WITH UNEXPECTED FREQUENCIES IN DEOXYRIBONUCLEIC-ACID SEQUENCES, Journal of the Royal Statistical Society. Series B: Methodological, 57(1), 1995, pp. 205-220
Number of citations
11
Subject categories
Statistics & Probability
Journal title
Journal of the Royal Statistical Society. Series B: Methodological
Considering a Markov chain model for deoxyribonucleic acid sequences,
this paper proposes two asymptotically normal statistics to test
whether the frequency of a given word is concordant with the
first-order Markov chain model or not. The problem is to choose
estimates mu(W) of the expectation of the frequency M(W) of a word W in
the observed sequence such that the asymptotic variance of
M(W) - mu(W) is easily computable. The first estimator is derived from
the frequency of W[-1], which is W with its last letter deleted. The
second, following an idea of Cowan, is the conditional expectation of
M(W) given the observed frequencies of all two-letter words. Two
examples on phage lambda and phage T7 are shown.
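
The first estimator can be illustrated with a minimal sketch. The abstract
does not state the formula, so the plug-in form used here is an assumption:
mu(W) = N(W[-1]) * N(ab) / N(a), where a and b are the last two letters of W
and N counts overlapping occurrences. The function names
count_occurrences and expected_count_from_prefix are hypothetical, and the
asymptotic variance needed for the actual test statistic is not computed.

```python
def count_occurrences(seq: str, word: str) -> int:
    """Count occurrences of `word` in `seq`, allowing overlaps.

    str.count is not used because it only counts non-overlapping matches.
    """
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def expected_count_from_prefix(seq: str, word: str) -> float:
    """Estimate the expected count of `word` under a first-order Markov chain.

    Assumed plug-in form (not taken from the abstract):
        mu(W) = N(W[-1]) * N(ab) / N(a)
    where W[-1] is `word` without its last letter and `ab` is the last
    two-letter word of W. N(ab) / N(a) estimates the transition a -> b.
    """
    if len(word) < 3:
        raise ValueError("word must have at least three letters")
    prefix = word[:-1]      # W[-1]: W with its last letter deleted
    last_pair = word[-2:]   # final two-letter word ab
    pivot = word[-2]        # letter a
    # Count a only where it has a successor, so the transition estimate
    # N(ab) / N(a) has a consistent denominator.
    n_pivot = count_occurrences(seq[:-1], pivot)
    if n_pivot == 0:
        return 0.0
    n_prefix = count_occurrences(seq, prefix)
    n_pair = count_occurrences(seq, last_pair)
    return n_prefix * n_pair / n_pivot
```

Comparing count_occurrences(seq, word) with expected_count_from_prefix(seq,
word) gives the raw difference M(W) - mu(W); turning it into a test requires
the variance derived in the paper.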