Counts of long aligned word matches among random letter sequences

Citation
Karlin, Samuel and Ost, Friedemann, Counts of long aligned word matches among random letter sequences, Advances in Applied Probability, 19(2), 1987, pp. 293-351
ISSN journal
00018678
Volume
19
Issue
2
Year of publication
1987
Pages
293 - 351
Database
ACNP
Abstract
Asymptotic distributional properties of the maximal length aligned word (a contiguous set of letters) among multiple random Markov dependent sequences composed of letters from a finite alphabet are given. For sequences of length $N$, $C_{r,s}(N)$, defined as the longest common aligned word found in $r$ or more of $s$ sequences, has order growth $\log N / (-\log \lambda)$, where $\lambda$ is the maximal eigenvalue of $r$-Schur product matrices from among the collections of Markov matrices that generate the sequences. The count $Z_{r,s}(N, k)$ of positions that initiate an aligned match of length exceeding $k = \log N / (-\log \lambda) + x$ but fail to match at the immediately preceding position has a limiting Poisson distribution. Distributional properties of other long aligned word relationships and patterns are also discussed.
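
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the function names (schur_product, max_eigenvalue, growth_order, longest_aligned_word) are hypothetical, the eigenvalue is approximated by plain power iteration (assuming strictly positive transition matrices), and the choice of which r matrices enter the Schur product is left to the caller. The sketch computes the entrywise product of Markov transition matrices, its maximal eigenvalue, the growth order log N / (-log lambda), and the empirical longest aligned word shared by r of s sequences.

```python
import math
from itertools import combinations


def schur_product(matrices):
    """Entrywise (Schur) product of equally sized square matrices."""
    n = len(matrices[0])
    out = [[1.0] * n for _ in range(n)]
    for m in matrices:
        for i in range(n):
            for j in range(n):
                out[i][j] *= m[i][j]
    return out


def max_eigenvalue(matrix, iterations=500):
    """Approximate the maximal eigenvalue by power iteration.

    Assumption: the matrix has strictly positive entries, so the
    iteration converges to the Perron eigenvalue.
    """
    n = len(matrix)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam


def growth_order(seq_length, lam):
    """Growth order log N / (-log lambda), valid for 0 < lambda < 1."""
    return math.log(seq_length) / (-math.log(lam))


def longest_aligned_word(seqs, r):
    """Longest word occupying the same positions in some r of the sequences."""
    best = 0
    for subset in combinations(seqs, r):
        run = 0
        for letters in zip(*subset):
            # Extend the run only while all r sequences agree at this position.
            run = run + 1 if len(set(letters)) == 1 else 0
            best = max(best, run)
    return best
```

For independent, uniformly distributed letters over a four-letter alphabet and r = 2, every transition matrix has all entries equal to 1/4, the Schur product of two such matrices has all entries 1/16, and its maximal eigenvalue is 1/4, so growth_order(N, 0.25) reduces to the base-4 logarithm of N.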