Cp. Ponting et al., Novel protein domains and repeats in Drosophila melanogaster: Insights into structure, function, and evolution, GENOME RES, 11(12), 2001, pp. 1996-2008
Sequence database searching methods, such as BLAST, are invaluable for predicting molecular function on the basis of sequence similarities among single regions of proteins. Searches of whole databases, however, are not optimized to detect multiple homologous regions within a single polypeptide. Here we have used the prospero algorithm to perform self-comparisons of all predicted Drosophila melanogaster gene products. Predicted repeats, and their homologs from all species, were analyzed further to detect hitherto unappreciated evolutionary relationships. Results included the identification of novel tandem repeats in the human X-linked retinitis pigmentosa type-2 gene product, repeated segments in cystinosin, associated with a defect in cystine transport, and 'nested' homologous domains in dysferlin, whose gene is mutated in limb girdle muscular dystrophy. Novel signaling domain families were found that may regulate the microtubule-based cytoskeleton and ubiquitin-mediated proteolysis, respectively. Two families of glycosyl hydrolases were shown to contain internal repetitions that hint at their evolution via a piecemeal, modular approach. In addition, three examples of fruit fly genes were detected with tandem exons that appear to have arisen via internal duplication. These findings demonstrate how completely sequenced genomes can be exploited to further understand the relationships between molecular structure, function, and evolution.