The vertebrate olfactory receptor (OR) subgenome harbors the largest known
gene family, which has been expanded by the need to provide recognition
capacity for millions of potential odorants. We implemented an automated
procedure to identify all OR coding regions from published sequences. This
led us to the identification of 831 OR coding regions (including
pseudogenes) from 24 vertebrate species. The resulting dataset was
subjected to neighbor-joining phylogenetic analysis and classified into 32
distinct families, 14 of which include only genes from tetrapodan species
(Class II ORs). We also
report here the first identification of OR sequences from a marsupial
(koala) and a monotreme (platypus). Analysis of these OR sequences
suggests that
the ancestral mammal had a small OR repertoire, which expanded
independently in all three mammalian subclasses. Classification of
"fishlike" (Class I) ORs indicates that some of these ancient ORs were
maintained and even expanded in mammals.
A nomenclature system for the OR gene superfamily is proposed, based on a
divergence evolutionary model. The nomenclature consists of the root symbol
'OR', followed by a family numeral, subfamily letter(s), and a numeral
representing the individual gene within the subfamily. For example, OR3A1
is an
OR gene of family 3, subfamily A, and OR7E12P is an OR pseudogene of
family 7, subfamily E. The symbol is to be preceded by a species indicator.
We have assigned the proposed nomenclature symbols for all 330 human OR
genes in the database. A WWW tool for automated name assignment is provided.
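
The structure of a symbol under the proposed nomenclature can be illustrated with a short parser. This is only a minimal sketch, not the name-assignment tool mentioned above: the names `ORSymbol` and `parse_or_symbol` are hypothetical, the trailing `P` as a pseudogene marker is inferred from the OR7E12P example, and the species indicator is left out because its format is not specified here.

```python
import re
from typing import NamedTuple, Optional


class ORSymbol(NamedTuple):
    """Components of an OR gene symbol (hypothetical container)."""
    family: int        # family numeral, e.g. 3 in OR3A1
    subfamily: str     # subfamily letter(s), e.g. "A" in OR3A1
    gene: int          # numeral of the individual gene within the subfamily
    pseudogene: bool   # True when the symbol ends in "P" (assumed marker)


# Root symbol "OR", family numeral, subfamily letter(s), gene numeral,
# and an optional trailing "P" (assumption based on the OR7E12P example).
_OR_PATTERN = re.compile(r"OR(\d+)([A-Z]+)(\d+)(P?)")


def parse_or_symbol(symbol: str) -> Optional[ORSymbol]:
    """Split a symbol such as 'OR7E12P' into its parts.

    Returns None when the whole string does not follow the pattern.
    """
    match = _OR_PATTERN.fullmatch(symbol)
    if match is None:
        return None
    family, subfamily, gene, suffix = match.groups()
    return ORSymbol(int(family), subfamily, int(gene), suffix == "P")


if __name__ == "__main__":
    for name in ("OR3A1", "OR7E12P"):
        print(name, parse_or_symbol(name))
```

Applied to the two examples in the text, the parser reads OR3A1 as family 3, subfamily A, gene 1, and OR7E12P as family 7, subfamily E, gene 12, flagged as a pseudogene.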