Special factors in biological strings

Citation
A. Colosimo and A. De Luca, Special factors in biological strings, J THEOR BIO, 204(1), 2000, pp. 29-46
Number of citations
18
Subject categories
Multidisciplinary
Journal title
JOURNAL OF THEORETICAL BIOLOGY
Journal ISSN
0022-5193
Volume
204
Issue
1
Year of publication
2000
Pages
29 - 46
Database
ISI
SICI code
0022-5193(20000507)204:1<29:SFIBS>2.0.ZU;2-P
Abstract
Biological macromolecules such as DNA, RNA, and proteins can be regarded as finite sequences of symbols (or words) over a finite alphabet. In this paper, we refer to DNA (RNA) sequences which are words on a four-letter alphabet. A comparison is made between some "genes", or fragments of them, with random sequences or random reshuffled sequences on the same alphabet and having the same length. Some combinatorial techniques of analysis of finite words are developed. A crucial role in the comparison is played by the so-called special factors of a given word. In all the analysed DNA (RNA) fragments the distribution on the length of the number of right (left) special factors differs, in a very typical way, from the corresponding distribution in a string on the same alphabet and having the same length generated by a random source or obtained by making a random alteration (= shuffling) of the original string. This kind of change is irrespective of the length in the range that we have considered (< 2650 bp) and of the phylogenetic origin of the fragment. (C) 2000 Academic Press.
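The abstract does not define special factors. The sketch below assumes the usual definition from combinatorics on words: a factor (substring) u of a word w is right special if at least two distinct letters a and b exist such that both ua and ub occur in w. It counts the right special factors of each length and builds a reshuffled string for comparison. The function names, the max_len cutoff, and the seed parameter are illustrative assumptions, not the authors' procedure.

```python
import random
from collections import defaultdict


def right_special_counts(word, max_len):
    """Map each length n in 0..max_len to the number of right special factors.

    Assumed definition: a factor u is right special in `word` if it is
    followed by at least two distinct letters somewhere in `word`.
    """
    counts = {}
    for n in range(max_len + 1):
        extensions = defaultdict(set)
        # Record the letter that follows each occurrence of a length-n factor.
        for i in range(len(word) - n):
            extensions[word[i:i + n]].add(word[i + n])
        counts[n] = sum(1 for letters in extensions.values() if len(letters) >= 2)
    return counts


def shuffled(word, seed=0):
    """Return a random permutation of the letters of `word` (same length,
    same letter frequencies). `seed` is a hypothetical reproducibility knob."""
    letters = list(word)
    random.Random(seed).shuffle(letters)
    return "".join(letters)
```

For the toy word "ACGACT", right_special_counts("ACGACT", 3) gives one right special factor at each of the lengths 0, 1 and 2 (the empty word, "C" and "AC") and none at length 3. Applying the same function to shuffled(word) gives a comparison in the spirit of the one the abstract describes; left special factors can be counted by passing the reversed word, word[::-1].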