J. Schultz et al., SMART, A SIMPLE MODULAR ARCHITECTURE RESEARCH TOOL - IDENTIFICATION OF SIGNALING DOMAINS, Proceedings of the National Academy of Sciences of the United States of America, 95(11), 1998, pp. 5857-5864
Accurate multiple alignments of 86 domains that occur in signaling
proteins have been constructed and used to provide a Web-based tool
(SMART: simple modular architecture research tool) that allows rapid
identification and annotation of signaling domain sequences. The
majority of
signaling proteins are multidomain in character, with a considerable
variety of domain combinations known. Comparison with established
databases showed that 25% of our domain set could not be deduced from
SwissProt and 41% could not be annotated by Pfam. SMART is able to
determine the modular architectures of single sequences or genomes;
application to the entire yeast genome revealed that at least 6.7% of
its genes
contain one or more signaling domains, approximately 350 more than
previously annotated. The process of constructing SMART predicted (i)
novel domain homologues in unexpected locations, such as band
4.1-homologous domains in focal adhesion kinases; (ii) previously
unknown domain families, including a citron-homology domain; (iii)
putative functions of domain families after identification of
additional family members, for example, a ubiquitin-binding role for
ubiquitin-associated domains (UBA); (iv) cellular roles for proteins,
such as predicted DEATH domains in netrin receptors, further
implicating these molecules in axonal
guidance; (v) signaling domains in known disease genes, such as SPRY
domains in both marenostrin/pyrin and Midline I; (vi) domains in
unexpected phylogenetic contexts, such as diacylglycerol kinase
homologues in yeast and bacteria; and (vii) likely protein
misclassifications, exemplified by a predicted pleckstrin homology
domain in a Candida albicans protein previously described as an
integrin.