NEIGHBOR JOINING AND MAXIMUM-LIKELIHOOD WITH RNA SEQUENCES - ADDRESSING THE INTERDEPENDENCE OF SITES

Citation
E.R.M. Tillier and R.A. Collins, NEIGHBOR JOINING AND MAXIMUM-LIKELIHOOD WITH RNA SEQUENCES - ADDRESSING THE INTERDEPENDENCE OF SITES, Molecular Biology and Evolution, 12(1), 1995, pp. 7-15
Number of citations
25
Subject Categories
Biology
ISSN journal
07374038
Volume
12
Issue
1
Year of publication
1995
Pages
7 - 15
Database
ISI
SICI code
0737-4038(1995)12:1<7:NJAMWR>2.0.ZU;2-R
Abstract
Intrastrand base pairings give ribosomal and other RNA molecules characteristic structures that are important for their function. In order to maintain these structures, a substitution at one paired site may have to be compensated for by an appropriate substitution at the complementary site. Thus paired sites do not evolve independently of one another. Most current methods for inferring phylogeny from molecular sequences assume that the sites are independent and will therefore give statistically unreliable and possibly erroneous results when used on structured RNA sequences. We analyze a new probabilistic model for the evolution of double-stranded RNA molecules that considers substitutions of the base pairs rather than of each of the bases independently. The new model, called the double-stranded model, was incorporated into the neighbor-joining distance and maximum likelihood methods. Computer simulations show that maximum likelihood is very robust to the violation of the assumption of the independence of sites. In contrast, the neighbor-joining method is sensitive to such violations: the double-stranded model can provide a significant increase in the chance of obtaining the correct tree topologies with neighbor joining when distances are large and the tree is difficult to obtain. The new model also leads to lower but more realistic estimates for the statistical confidence in the branch lengths and tree topologies.
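The difference between treating bases independently and treating base pairs as units can be illustrated with a small sketch. This is not the double-stranded model from the paper, only a toy comparison of the two ways of counting. The sequences, the pairing list, and the function names are hypothetical, chosen so that one compensatory change separates the two sequences.

```python
def site_differences(seq_a, seq_b, pairs):
    """Count differences over paired positions, each base taken as its own site."""
    positions = [p for pair in pairs for p in pair]
    return sum(seq_a[p] != seq_b[p] for p in positions), len(positions)


def pair_differences(seq_a, seq_b, pairs):
    """Count differences with each base pair (i, j) taken as a single character."""
    diffs = sum((seq_a[i], seq_a[j]) != (seq_b[i], seq_b[j]) for i, j in pairs)
    return diffs, len(pairs)


# Hypothetical three-pair stem: positions 0, 1, 2 pair with 8, 7, 6.
pairs = [(0, 8), (1, 7), (2, 6)]
seq_a = "GCAUUUUGC"  # pairs: G-C, C-G, A-U
seq_b = "GUAUUUUAC"  # pairs: G-C, U-A, A-U (compensatory change in the middle pair)

site_result = site_differences(seq_a, seq_b, pairs)  # two changes among six sites
pair_result = pair_differences(seq_a, seq_b, pairs)  # one change among three pairs
print(site_result, pair_result)
```

In this toy case the site-wise count records two changes across six characters, while the pair-wise count records one change across three characters. Counting correlated sites as independent observations doubles the apparent amount of data, which is consistent with the abstract's remark that the pair-based model gives lower but more realistic confidence estimates.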