Post-transcriptional regulation of gene expression is often accomplished by
proteins binding to specific sequence motifs in mRNA molecules, to affect
their translation or stability. The motifs are often composed of a combination
of sequence and structural constraints such that the overall structure
is preserved even though much of the primary sequence is variable. While several
methods exist to discover transcriptional regulatory sites in the DNA
sequences of coregulated genes, the RNA motif discovery problem is much more
difficult because of covariation in the positions. We describe the combined
use of two approaches for RNA structure prediction, FOLDALIGN and COVE,
that together can discover and model stem-loop RNA motifs in unaligned
sequences, such as UTRs from post-transcriptionally coregulated genes. We
evaluate the method on two datasets: one a section of rRNA genes with randomly
truncated ends, so that a global alignment is not possible, and the other a
hyper-variable collection of IRE-like elements that were inserted into
randomized UTR sequences. In both cases the combined method identified the
motifs correctly, and in the rRNA example we show that it is capable of
determining the structure, which includes bulge and internal loops as well as
a variable-length hairpin loop. Those automated results are quantitatively
evaluated and found to agree closely with structures contained in curated
databases, with correlation coefficients up to 0.9. A basic server, Stem-Loop
Align SearcH (SLASH), which will perform stem-loop searches in unaligned RNA
sequences, is available at http://www.bioinf.au.dk/slash/.
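
As an illustration of how a predicted stem-loop might be scored against a curated structure, the following is a minimal sketch only. The abstract does not say which correlation coefficient was used; the sketch assumes a commonly used approximation to the Matthews correlation coefficient over base pairs, the geometric mean of sensitivity and positive predictive value. The dot-bracket input format and the function names `base_pairs` and `pair_agreement` are hypothetical choices made for this example and are not part of FOLDALIGN, COVE, or SLASH.

```python
from math import sqrt


def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)          # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), j))  # pair with the most recent opener
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def pair_agreement(predicted, reference):
    """Compare two structures given over the same sequence coordinates.

    Returns (sensitivity, ppv, score), where score = sqrt(sensitivity * ppv)
    is the assumed approximation to the Matthews correlation coefficient.
    """
    pred, ref = base_pairs(predicted), base_pairs(reference)
    tp = len(pred & ref)                       # base pairs found in both
    sensitivity = tp / len(ref) if ref else 0.0
    ppv = tp / len(pred) if pred else 0.0
    return sensitivity, ppv, sqrt(sensitivity * ppv)


if __name__ == "__main__":
    # Toy hairpin: the prediction misses the outermost base pair.
    reference = "((((...))))"
    predicted = ".(((...)))."
    print(pair_agreement(predicted, reference))
```

Because only base pairs are compared, two sequences with little primary-sequence identity can still score highly against the same reference, which reflects the kind of structural conservation the motifs described above rely on.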