Discovering common stem-loop motifs in unaligned RNA sequences

Citation
J. Gorodkin et al., Discovering common stem-loop motifs in unaligned RNA sequences, NUCL ACID R, 29(10), 2001, pp. 2135-2144
Number of citations
56
Subject categories
Biochemistry & Biophysics
Journal title
NUCLEIC ACIDS RESEARCH
ISSN journal
0305-1048
Volume
29
Issue
10
Year of publication
2001
Pages
2135 - 2144
Database
ISI
SICI code
0305-1048(20010515)29:10<2135:DCSMIU>2.0.ZU;2-P
Abstract
Post-transcriptional regulation of gene expression is often accomplished by proteins binding to specific sequence motifs in mRNA molecules, to affect their translation or stability. The motifs are often composed of a combination of sequence and structural constraints such that the overall structure is preserved even though much of the primary sequence is variable. While several methods exist to discover transcriptional regulatory sites in the DNA sequences of coregulated genes, the RNA motif discovery problem is much more difficult because of covariation in the positions. We describe the combined use of two approaches for RNA structure prediction, FOLDALIGN and COVE, that together can discover and model stem-loop RNA motifs in unaligned sequences, such as UTRs from posttranscriptionally coregulated genes. We evaluate the method on two datasets, one a section of rRNA genes with randomly truncated ends so that a global alignment is not possible, and the other a hyper-variable collection of IRE-like elements that were inserted into randomized UTR sequences. In both cases the combined method identified the motifs correctly, and in the rRNA example we show that it is capable of determining the structure, which includes bulge and internal loops as well as a variable length hairpin loop. Those automated results are quantitatively evaluated and found to agree closely with structures contained in curated databases, with correlation coefficients up to 0.9. A basic server, Stem-Loop Align SearcH (SLASH), which will perform stem-loop searches in unaligned RNA sequences, is available at http://www.bioinf.au.dk/slash/.