Eukaryotic ribosomes are made of two components: four ribosomal RNAs and
approximately 80 ribosomal proteins (r-proteins). The exact number of
r-proteins and r-protein genes in higher plants is not known. The strong
conservation in eukaryotic r-protein primary sequence allowed us to use the
well-characterized rat (Rattus norvegicus) r-protein set to identify
orthologues on the five haploid chromosomes of Arabidopsis. Using the
numerous expressed sequence tag (EST) accessions and the complete genomic
sequence of this species, we identified 249 genes (including some
pseudogenes) corresponding to 80 (32 small subunit and 48 large subunit)
cytoplasmic r-protein types. None of the r-protein genes is single copy, and
most r-proteins are encoded by three or four expressed genes, indicative of
the internal duplication of the Arabidopsis genome. The r-protein genes are
distributed throughout the genome. Inspection of genes in the vicinity of
r-protein gene family members confirms
extensive duplications of large chromosome fragments and sheds light on the
evolutionary history of the Arabidopsis genome. Examination of large
duplicated regions indicated that a significant fraction of the r-protein
genes have been either lost from one of the duplicated fragments or inserted
after the initial duplication event. Only 52 r-protein genes lack a matching
EST accession, and 19 of these contain incomplete open reading frames,
confirming that most genes are expressed. Assessment of cognate EST numbers
suggests that r-protein gene family members are differentially expressed.