Lj. Jensen et al., SCORING FUNCTIONS FOR COMPUTATIONAL ALGORITHMS APPLICABLE TO THE DESIGN OF SPIKED OLIGONUCLEOTIDES, Nucleic acids research, 26(3), 1998, pp. 697-702
Protein engineering by inserting stretches of random DNA sequences
into target genes in combination with adequate screening or selection
methods is a versatile technique to elucidate and improve protein
functions. Established compounds for generating semi-random DNA
sequences are spiked oligonucleotides which are synthesised by
interspersing wild type (wt) nucleotides of the target sequence with
certain amounts of other nucleotides. Directed spiking strategies
reduce the complexity of a library to a manageable format compared
with completely random libraries. Computational algorithms render
feasible the calculation of appropriate nucleotide mixtures to encode
specified amino acid subpopulations. The crucial element in the
ranking of spiked codons generated during an iterative algorithm is
the scoring function. In this report three scoring functions are
analysed: the sum-of-square-differences function s, a modified cubic
function c, and a scoring function m derived from maximum likelihood
considerations. The impact of these scoring functions on calculated
amino acid distributions is demonstrated by an example of mutagenising
a domain surrounding the active site serine of subtilisin-like
proteases. At default weight settings of one for each amino acid, the
new scoring function m is superior to functions s and c in finding
matches to a given amino acid population.
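
To make the scoring idea concrete, the following is a minimal sketch of how a spiked codon could be scored with a sum-of-square-differences function. It assumes the standard genetic code and independent nucleotide mixtures at the three codon positions. The names `amino_acid_distribution` and `sum_of_squares_score`, the dictionary layout, and the convention that a lower score means a closer match are illustrative assumptions, not taken from the paper. The abstract does not give formulas for the cubic function c or the maximum likelihood function m, so they are not reproduced here.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acid_distribution(spiked_codon):
    """Return the amino acid frequencies encoded by a spiked codon.

    spiked_codon is a sequence of three dicts (one per codon position),
    each mapping a base to its fraction in the nucleotide mixture.
    Assumption: the three positions are mixed independently.
    """
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for pos, base in enumerate(codon):
            p *= spiked_codon[pos].get(base, 0.0)
        dist[aa] = dist.get(aa, 0.0) + p
    return dist


def sum_of_squares_score(dist, target, weights=None):
    """Weighted sum of squared differences between two distributions.

    weights defaults to one for every amino acid; in this sketch a
    lower score means a closer match (an assumed convention).
    """
    weights = weights or {}
    residues = set(dist) | set(target)
    return sum(
        weights.get(aa, 1.0) * (dist.get(aa, 0.0) - target.get(aa, 0.0)) ** 2
        for aa in residues
    )


# Hypothetical example: a serine codon TCT with the first position
# spiked 50:50 with G, which yields the codons TCT (Ser) and GCT (Ala).
spiked = [{"T": 0.5, "G": 0.5}, {"C": 1.0}, {"T": 1.0}]
dist = amino_acid_distribution(spiked)
score_exact = sum_of_squares_score(dist, {"S": 0.5, "A": 0.5})
score_ser_only = sum_of_squares_score(dist, {"S": 1.0})
```

In an iterative search, a score of this kind would be computed for each candidate nucleotide mixture and used to rank the candidates against the desired amino acid population.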