Motivation: TOPS cartoons are schematic abstractions of three-dimensional
protein structures in two dimensions, and are used for understanding and
manual comparison of protein folds. Recently, an algorithm that produces the
cartoons automatically from protein structures has been devised, and
cartoons have been generated to represent all the structures in the
structural databank. There is now a need to be able to define target
topological patterns
and to search the database for matching domains.
Results: We have devised a formal language for describing TOPS diagrams and
patterns, and have designed an efficient algorithm to match a pattern to a
set of diagrams. A pattern-matching system has been implemented and tested
on a database derived from all the current entries in the Protein Data Bank
(15000 domains). Users can search on patterns selected from a library of
motifs or, alternatively, they can define their own search patterns.