The reliable construction of evolutionary trees from nucleotide
sequences often depends on randomization tests such as the bootstrap [1]
and PTP (cladistic permutation tail probability) tests [2-6]. The
genomes of bacteria [7], viruses [8], animals [7,9,10] and plants [11],
however, vary widely in their nucleotide frequencies. Where genomes have
independently acquired similar G+C base compositions, signals in the
data arise that cause methods of evolutionary tree reconstruction to
estimate the wrong tree by grouping together sequences with similar G+C
content [12-14]. Under these conditions randomization tests can lead to
both the rejection of the correct evolutionary hypothesis and acceptance
of an incorrect hypothesis (such as with the contradictory inferences
from the photosynthetic rbcS and rbcL sequences [14]). We have proposed
one approach to testing for the G+C content problem [15]. Here we
present a formalization of this method, a frequency-dependent
significance test, which has general application.
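
To make the quantity at issue concrete, the following is a minimal Python sketch of how the G+C content of a set of aligned sequences might be tabulated. It is not the frequency-dependent significance test itself; the function name gc_content, the taxon labels and the toy sequences are all hypothetical and were invented for illustration only.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)


# Hypothetical toy sequences (assumption: not taken from any real data set).
toy_sequences = {
    "taxon_A": "GCGCGGCCATGC",
    "taxon_B": "ATATTAAGCTAT",
    "taxon_C": "GGCCGCGCTAGC",
    "taxon_D": "TTAATATACGAT",
}

# Tabulate G+C content per taxon and list taxa from lowest to highest.
gc_by_taxon = {name: gc_content(seq) for name, seq in toy_sequences.items()}
for name, gc in sorted(gc_by_taxon.items(), key=lambda item: item[1]):
    print(f"{name}: G+C = {gc:.2f}")
```

In this toy set, taxon_A and taxon_C share a high G+C content while taxon_B and taxon_D share a low one. If that similarity had been acquired independently, it is the kind of compositional signal that could draw the wrong pairs of sequences together in a reconstructed tree.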