GENERALIZED AFFINE GAP COSTS FOR PROTEIN-SEQUENCE ALIGNMENT

Authors
Citation
S.F. Altschul, GENERALIZED AFFINE GAP COSTS FOR PROTEIN-SEQUENCE ALIGNMENT, Proteins, 32(1), 1998, pp. 88-96
Number of citations
68
Subject Categories
Biology, Genetics & Heredity
Journal title
Journal ISSN
0887-3585
Volume
32
Issue
1
Year of publication
1998
Pages
88 - 96
Database
ISI
SICI code
0887-3585(1998)32:1<88:GAGCFP>2.0.ZU;2-0
Abstract
Based on the observation that a single mutational event can delete or insert multiple residues, affine gap costs for sequence alignment charge a penalty for the existence of a gap, and a further length-dependent penalty. From structural or multiple alignments of distantly related proteins, it has been observed that conserved residues frequently fall into ungapped blocks separated by relatively nonconserved regions. To take advantage of this structure, a simple generalization of affine gap costs is proposed that allows nonconserved regions to be effectively ignored. The distribution of scores from local alignments using these generalized gap costs is shown empirically to follow an extreme value distribution. Examples are presented for which generalized affine gap costs yield superior alignments from the standpoints both of statistical significance and of alignment accuracy. Guidelines for selecting generalized affine gap costs are discussed, as is their possible application to multiple alignment. Proteins 32:88-96, 1998. © 1998 Wiley-Liss, Inc.
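The ordinary affine gap cost described in the first sentence of the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the parameter names gap_open and gap_extend, and the example values 10 and 1 are assumptions chosen for illustration, and the convention used here (existence penalty plus a per-residue penalty on the full gap length) is only one of several in common use. The generalized costs proposed in the paper are not reproduced, since the abstract does not state their form.

```python
def linear_gap_cost(length: int, per_residue: int) -> int:
    """Cost that grows only with gap length (no existence penalty)."""
    if length < 0:
        raise ValueError("gap length must be non-negative")
    return per_residue * length


def affine_gap_cost(length: int, gap_open: int, gap_extend: int) -> int:
    """Affine cost: a fixed penalty for the gap's existence plus a
    length-dependent penalty. Assumed convention: gap_open + gap_extend * length.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0  # no gap, no penalty
    return gap_open + gap_extend * length


# Hypothetical parameter values, for illustration only.
GAP_OPEN, GAP_EXTEND = 10, 1

# One mutational event removing 10 residues: a single gap of length 10.
one_long_gap = affine_gap_cost(10, GAP_OPEN, GAP_EXTEND)

# Ten independent events each removing 1 residue: ten gaps of length 1.
ten_short_gaps = 10 * affine_gap_cost(1, GAP_OPEN, GAP_EXTEND)

print(one_long_gap, ten_short_gaps)
```

Under these assumed values, a single long gap is charged far less than the same number of gapped residues spread over many separate gaps, which reflects the observation that one mutational event can insert or delete several residues at once. A purely linear cost cannot make that distinction.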