P. Petrilli and Nj. Tonukari, PFDB - A PROTEIN FAMILIES DATABASE FOR MACINTOSH COMPUTERS - THE EFFECTIVENESS OF ITS ORGANIZATION IN SEARCHING FOR PROTEIN SIMILARITY, Journal of protein chemistry, 16(7), 1997, pp. 713-720
A protein sequence database (PFDB) containing about 11,000 entries is
available for Macintosh computers. The PFDB can be easily updated by
importing sequences from the PIR collection through the internet. The
most important feature of the database is its organization in families
of closely related sequences, each family being characterized by its
average dipeptide composition [Petrilli (1993), Comput. Appl. Biosci.
2, 89-93]. This allows one to perform a rapid and sensitive protein
similarity search by comparing the precalculated dipeptide composition
of each family with that of the query sequence, using a linear
correlation coefficient. An example application is reported in which a
new protein was classified from a fragment only 19 residues long.
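
The search idea described above can be illustrated with a minimal Python sketch. It is not the PFDB implementation: the function names, the use of relative frequencies over the 400 standard dipeptides, the per-sequence averaging used for the family profile, and the choice of the Pearson coefficient as the "linear correlation coefficient" are all assumptions made for illustration, and the toy sequences are invented.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids and the 400 ordered dipeptides built from them.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Relative frequency of each overlapping dipeptide in a sequence.

    Assumption: pairs containing non-standard residue codes are skipped.
    """
    counts = dict.fromkeys(DIPEPTIDES, 0)
    sequence = sequence.upper()
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def family_composition(sequences):
    """Average the compositions of a family's members (assumed averaging scheme)."""
    compositions = [dipeptide_composition(s) for s in sequences]
    return [sum(column) / len(compositions) for column in zip(*compositions)]


def pearson(x, y):
    """Pearson linear correlation coefficient; returns 0.0 for a constant vector."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_families(query, family_profiles):
    """Rank families by correlation between their stored profile and the query."""
    query_profile = dipeptide_composition(query)
    scores = {
        name: pearson(query_profile, profile)
        for name, profile in family_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy families; real profiles would be precalculated once
    # from the database entries and stored.
    profiles = {
        "family_a": family_composition(["ACACACACAC", "ACACACAAC"]),
        "family_b": family_composition(["GGKGGKGGK", "KGGKGGKGG"]),
    }
    for name, score in rank_families("ACACAC", profiles):
        print(f"{name}\t{score:.3f}")
```

Because each family profile is computed once and stored, a query in this sketch costs one 400-element correlation per family rather than an alignment against every sequence, which is the kind of saving the abstract attributes to the family organization.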