The predicted proteins of the genome of Caenorhabditis elegans were
analysed by various sequence comparison methods to identify the repertoire
of proteins that are members of the immunoglobulin superfamily (IgSF). The
IgSF is one of the largest families of protein domains in this genome and
is likely to be one of the major families in other multicellular eukaryotes
too. This is because members of the superfamily are involved in a variety
of functions including cell-cell recognition, cell-surface receptors,
muscle structure and, in higher organisms, the immune system. Sixty-four
proteins with 488 I set IgSF domains were identified, largely by using
hidden Markov models. The domain architectures of the protein products of
these 64 genes are described. Twenty-one of these had been characterised
previously. We show that another 25 are related to proteins of known
function. The C. elegans IgSF proteins can be classified into five broad
categories: muscle proteins, protein kinases and phosphatases, three
categories of proteins involved in the development of the nervous system,
leucine-rich repeat containing proteins and proteins without homologues of
known function, of which there are 18. The 19 proteins involved in nervous
system development that are not kinases or phosphatases are homologues of
neuroglian, axonin, NCAM, wrapper, klingon, ICCR and nephrin or belong to
the recently identified zig gene family. Out of the set of 64 genes, 22 are
on the X chromosome. This study should be seen as an initial description of
the IgSF repertoire in C. elegans, because the current gene definitions may
contain a number of errors, especially in the case of long sequences, and
there may be IgSF genes that have not yet been detected. However, the
proteins described here do provide an overview of the bulk of the
repertoire of immunoglobulin superfamily members in C. elegans, a framework
for refinement and extension of the repertoire as gene and protein
definitions improve, and the basis for investigations of their function and
for comparisons with the repertoires of other organisms. © 2000 Academic
Press.