The PRINTS database of protein 'fingerprints' is described. Fingerprints comprise sets of motifs excised from conserved regions of sequence alignments, their diagnostic power, or potency, being refined by iterative database scanning (in this case of the OWL composite sequence database). Generally, the motifs do not overlap but are separated along a sequence, though they may be contiguous in 3-D space. The use of groups of independent, linearly or spatially separate motifs allows particular protein folds and functionalities to be characterized more flexibly and powerfully than with conventional single-component patterns or regular expressions. The current version of the database (4.0) contains 150 entries (encoding >700 motifs), covering a wide range of globular and membrane proteins, modular polypeptides and so on. The growth of the database is influenced by a number of factors, e.g. the use of multiple motifs, the maximization of sequence information through iterative database scanning and the fact that the database searched is a large composite. The information contained within PRINTS is distinct from but complementary to the single consensus expressions stored in the widely used PROSITE dictionary of patterns.
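
The idea of a fingerprint as an ordered group of non-overlapping motifs can be illustrated with a minimal sketch. It is not the PRINTS scanning procedure: the function names, the per-column residue-set scoring, the greedy left-to-right search and the `threshold` parameter are all assumptions made for illustration, and the two toy motifs are invented rather than taken from any database entry.

```python
from typing import List, Optional, Sequence, Set

# A motif is modelled here (an assumption) as a list of ungapped,
# equal-length segments taken from one conserved alignment region.
Motif = List[str]


def column_sets(motif: Motif) -> List[Set[str]]:
    """Collect the residues observed at each aligned position of a motif."""
    return [set(column) for column in zip(*motif)]


def window_score(window: str, columns: Sequence[Set[str]]) -> float:
    """Fraction of positions whose residue was observed in that motif column."""
    hits = sum(1 for residue, column in zip(window, columns) if residue in column)
    return hits / len(columns)


def match_fingerprint(
    seq: str, motifs: Sequence[Motif], threshold: float = 0.8
) -> Optional[List[int]]:
    """Greedily locate every motif in order, without overlap.

    Returns the start offset of each motif, or None if any motif is missing.
    """
    starts: List[int] = []
    pos = 0  # the next motif must begin at or after this offset
    for motif in motifs:
        columns = column_sets(motif)
        width = len(columns)
        found = None
        for i in range(pos, len(seq) - width + 1):
            if window_score(seq[i:i + width], columns) >= threshold:
                found = i
                break
        if found is None:
            return None
        starts.append(found)
        pos = found + width  # enforce linear separation of the motifs
    return starts


# Toy data, invented for this example only.
toy_fingerprint = [["GKT", "GKS"], ["DEAD", "DEAH"]]
print(match_fingerprint("MAAGKSLLLLDEAHQQ", toy_fingerprint))
print(match_fingerprint("MAADEAHGKS", toy_fingerprint))
```

The second call fails because the motifs occur in the wrong order, which is the extra discrimination that a group of motifs provides over a single pattern. A realistic scan would use a more careful scoring scheme and would tolerate partial matches, neither of which is attempted here.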