Large-scale sequencing projects are widening the gap between the known
protein universe and the fraction for which structural information has been
experimentally obtained. Through the application of homology (comparative)
modeling and more general structure prediction techniques, this gap can,
however, be narrowed, providing indirect structural information for a
considerable number of proteins. Moreover, the estimated number of existing
protein folds seems to be limited and many of these yet unknown folds should
be discovered by dedicated large-scale structural genomics projects. Within
this perspective, homology (comparative) modeling will gain in importance,
as will the use of models derived by this technique. Here we discuss how
well a sequence alignment, the most common starting point for generating a
model, reflects the structural conservation between homologous proteins and
we show that sequence information is able to direct construction of
acceptable models as far as the structural core is concerned. We also show
here that the regions surrounding insertions and deletions are much less
conserved than the core and discuss the implications of this observation for
loop modeling. © 2001 Academic Press.