STATISTICS OF RNA SECONDARY STRUCTURES

Authors

FONTANA W; KONINGS DAM; STADLER PF; SCHUSTER P

Citation

W. Fontana et al., STATISTICS OF RNA SECONDARY STRUCTURES, Biopolymers, 33(9), 1993, pp. 1389-1404

Citations number

Subject categories

Biology

Journal title

Biopolymers

ISSN journal

0006-3525

Volume

33

Issue

9

Year of publication

1993

Pages

1389 - 1404

Database

ISI

SICI code

0006-3525(1993)33:9<1389:SORSS>2.0.ZU;2-E

Abstract

A statistical reference for RNA secondary structures with minimum free energies is computed by folding large ensembles of random RNA sequences. Four nucleotide alphabets are used: two binary alphabets, AU and GC, the biophysical AUGC and the synthetic GCXK alphabet. RNA secondary structures are made of structural elements, such as stacks, loops, joints, and free ends. Statistical properties of these elements are computed for small RNA molecules of chain lengths up to 100. The results of RNA structure statistics depend strongly on the particular alphabet chosen. The statistical reference is compared with the data derived from natural RNA molecules with similar base frequencies. Secondary structures are represented as trees. Tree editing provides a quantitative measure for the distance d(t) between two structures. We compute a structure density surface as the conditional probability of two structures having distance t given that their sequences have distance h. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly small neighborhood of any typical (random) sequence. Correlation lengths for secondary structures in their tree representations are computed from probability densities. They are appropriate measures for the complexity of the sequence-structure relation. The correlation length also provides a quantitative estimate for the mean sensitivity of structures to point mutations. (C) 1993 John Wiley & Sons, Inc.
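
The structure density surface described in the abstract, the conditional probability of structure distance t given sequence distance h, can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: structures are supplied by the caller as dot-bracket strings (no folding is performed here), sequence distance is taken to be the Hamming distance, and a simple base-pair distance stands in for the tree-editing distance the authors use. The function names `hamming`, `base_pairs`, `base_pair_distance`, and `density_surface` are hypothetical, and the three input sequences are hand-written toy data, not results from the study.

```python
from collections import Counter, defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def base_pairs(dot_bracket):
    """Set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced opening bracket")
    return pairs


def base_pair_distance(s1, s2):
    """Stand-in structure distance (NOT tree editing): number of base
    pairs present in exactly one of the two structures."""
    return len(base_pairs(s1) ^ base_pairs(s2))


def density_surface(folded):
    """Estimate P(t | h) from all pairs drawn from `folded`,
    a list of (sequence, dot-bracket structure) tuples."""
    counts = defaultdict(Counter)
    for (seq1, st1), (seq2, st2) in combinations(folded, 2):
        counts[hamming(seq1, seq2)][base_pair_distance(st1, st2)] += 1
    # Normalise each row so the probabilities for a given h sum to 1.
    return {
        h: {t: n / sum(row.values()) for t, n in row.items()}
        for h, row in counts.items()
    }


if __name__ == "__main__":
    # Toy, hand-written input for illustration only.
    toy = [
        ("GGGAAACCC", "(((...)))"),
        ("GGGAAACCU", ".((...))."),
        ("GGGAAAUCC", "(((...)))"),
    ]
    for h, row in sorted(density_surface(toy).items()):
        print(h, row)
```

In a real analysis the structures would come from a minimum free energy folding program and the distance from a tree-editing algorithm; both are left as caller-supplied inputs here so that the sketch makes no claims about any particular library.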