Ai. Archakov et al., CLUSTERIZATION OF P450 SUPERFAMILY USING THE OBJECTIVE PAIR ALIGNMENT METHOD AND THE UPGMA PROGRAM, JOURNAL OF MOLECULAR MODELING, 4(7), 1998, pp. 234-238
Because protein sequences are obtained by translating DNA, the gene name is commonly used as the enzyme identifier. Previously constructed single-family-member phylogenetic trees were produced by pair alignment. Such alignments depend strictly on user-defined parameters and algorithmic peculiarities, including but not limited to the homology matrix, the initial gap penalty value and the gap elongation function. This raises the need for a complete clusterization that reflects the relationships between protein primary structures. This protein-based clusterization should be made using objective pair alignment. The standard dynamic alignment procedure is modified in order to discriminate between suboptimal resulting scores. A special function treats the presence of continuous matching n-tuples as a favorable property of an alignment. Pair alignment is made objective by finding the optimal gap penalty, which gives the maximal difference in identity between random and related sequences. The method is applied to the cytochrome P450 superfamily. Our sample also contained 15 nitric oxide synthases and 30 random sequences. The similarity matrix obtained by objective pair alignment is processed by the standard UPGMA method.
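
The final step, clustering a pairwise similarity matrix by UPGMA, can be illustrated with a minimal Python sketch. This is not the authors' program: the function name `upgma`, the four labels, the identity values and the conversion of similarity to distance as 1 - identity are all assumptions chosen for illustration.

```python
from itertools import combinations


def upgma(labels, dist):
    """Textbook UPGMA on a square, symmetric distance matrix.

    Returns (tree, merges): tree is a nested tuple of labels, merges is a
    list of (left_subtree, right_subtree, node_height) in merge order.
    """
    n = len(labels)
    # Active clusters: id -> (subtree, number of leaves in it).
    clusters = {i: (labels[i], 1) for i in range(n)}
    # Pairwise distances, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        i, j = min(combinations(sorted(clusters), 2), key=lambda p: d[p])
        dij = d[(i, j)]
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted mean of the two old distances (average linkage).
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            d[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        clusters[next_id] = ((ti, tj), ni + nj)
        merges.append((ti, tj, dij / 2))  # node height is half the distance
        next_id += 1
    (tree, _), = clusters.values()
    return tree, merges


# Hypothetical pairwise identities (fractions), not data from the paper.
labels = ["A", "B", "C", "D"]
identity = [
    [1.0, 0.9, 0.2, 0.3],
    [0.9, 1.0, 0.4, 0.3],
    [0.2, 0.4, 1.0, 0.8],
    [0.3, 0.3, 0.8, 1.0],
]
# Assumed conversion from similarity to distance: 1 - identity.
dist = [[1.0 - s for s in row] for row in identity]
tree, merges = upgma(labels, dist)
print(tree)
```

With these made-up values, A and B are joined first, then C and D, and the two pairs are joined last. A different similarity-to-distance conversion would change the node heights and could change the topology.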