A SEQUENCE PROPERTY APPROACH TO SEARCHING PROTEIN DATABASES

Authors

HOBOHM U; SANDER C

Citation

U. Hobohm and C. Sander, A SEQUENCE PROPERTY APPROACH TO SEARCHING PROTEIN DATABASES, Journal of Molecular Biology, 251(3), 1995, pp. 390-399

Subject Categories

Biology

Journal title

Journal of Molecular Biology

ISSN journal

0022-2836

Volume

251

Issue

3

Year of publication

1995

Pages

390 - 399

Database

ISI

SICI code

0022-2836(1995)251:3<390:ASPATS>2.0.ZU;2-P

Abstract

Currently available sequence alignment programs are generally not capable of detecting functional and structural homologs in the twilight zone of sequence similarity, i.e. when the sequence identity falls below about 25%. Here we attempt to detect such weak similarities using an approach based on a notion of protein sequence similarity radically different from that used in sequential alignment. The approach defines protein sequence dissimilarity (or distance) as a weighted sum of differences of compositional properties such as singlet and doublet amino acid composition, molecular weight, isoelectric point (protein property search or PropSearch). With PropSearch, either single sequences can be used for a database query, or multiple sequences can be merged into an "average" sequence reflecting the average composition of a protein family. First, we show that members of structural protein families have a low mutual PropSearch distance when the weights are optimized to discriminate maximally between structural families. Second, we demonstrate the results of database searches using the PropSearch method. Such searches are very rapid when scanning a preprocessed database and do not require alignments. In cases in which conventional alignment tools fail to detect similarities, PropSearch can be used to generate hypotheses about possible structural or functional relationships between a new sequence and sequences in the database.
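
The distance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of absolute differences, the uniform weights, and the use of sequence length as a crude stand-in for molecular weight are all assumptions made for illustration, and isoelectric point is omitted. The published method optimizes the weights to separate structural families; no such optimization is attempted here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def singlet_composition(seq):
    """Fraction of each amino acid in the sequence (20 values)."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def doublet_composition(seq):
    """Fraction of each adjacent residue pair (400 values)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n = max(len(seq) - 1, 1)
    return [pairs[a + b] / n for a, b in product(AMINO_ACIDS, repeat=2)]


def property_vector(seq):
    """Hypothetical property vector: singlets, doublets, and length.

    Length stands in for molecular weight (an assumption of this sketch).
    """
    return singlet_composition(seq) + doublet_composition(seq) + [float(len(seq))]


def average_vector(seqs):
    """Merge several sequences into one 'average' property vector."""
    vectors = [property_vector(s) for s in seqs]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def prop_distance(u, v, weights):
    """Weighted sum of absolute property differences (assumed form)."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, u, v))


# Usage: compare two toy sequences with uniform (unoptimized) weights.
query = property_vector("AAAA")
target = property_vector("CCCC")
weights = [1.0] * len(query)
print(prop_distance(query, target, weights))
```

Because each database entry reduces to a fixed-length vector that can be computed once in advance, a search is a single pass of cheap vector comparisons, which is consistent with the abstract's remark that scanning a preprocessed database is rapid and needs no alignment.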