Directed evolution experiments rely on the cyclical application of
mutagenesis, screening and amplification in a test tube. They have led to
the creation of novel proteins for a wide range of applications. However,
directed evolution currently requires an uncertain, typically large, number
of labor-intensive and expensive experimental cycles before proteins with
improved function are identified. This paper introduces predictive models
that quantify the outcome of these experiments, aiding in the setup of
directed evolution to maximize the chances of obtaining DNA sequences
encoding enzymes with improved activities. Two methods of DNA manipulation
are analyzed: error-prone PCR and DNA recombination. Error-prone PCR is a
DNA replication process that intentionally introduces copying errors by
imposing mutagenic reaction conditions. The proposed model calculates the
probability of producing a specific nucleotide sequence after a number of
PCR cycles. DNA recombination methods rely on the mixing and concatenation
of genetic material from a number of parent sequences. This paper focuses
on modeling a specific DNA recombination protocol, DNA shuffling. Three
aspects of the DNA shuffling procedure are modeled: the fragment size
distribution after random fragmentation by DNase I, the assembly of DNA
fragments, and the probability of assembling specific sequences or
combinations of mutations. Results obtained with the proposed models
compare favorably with experimental data. (C) 2000 Academic Press.
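
As a rough illustration of the kind of quantity an error-prone PCR model estimates, the following minimal Python sketch computes the probability of obtaining a specific target sequence from a parent sequence. It is not the model proposed in the paper. It assumes a uniform per-base substitution rate per duplication (`error_rate`), equal probability for each of the three possible substitutions, independent sites, no insertions or deletions, and a fixed number of duplications (`duplications`) along the lineage of the molecule, whereas in a real PCR the number of duplications varies between molecules. All function and parameter names are hypothetical.

```python
def site_probabilities(error_rate, duplications):
    """Return (p_same, p_specific_change) for a single site.

    Assumption (not from the paper): each duplication replaces a base with
    probability `error_rate`, choosing uniformly among the 3 other bases.
    The closed form below is the n-step transition probability of that
    symmetric 4-state Markov chain, so back-mutations are accounted for.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    if duplications < 0:
        raise ValueError("duplications must be non-negative")
    decay = (1.0 - 4.0 * error_rate / 3.0) ** duplications
    p_same = 0.25 + 0.75 * decay             # site ends as the parent base
    p_specific_change = 0.25 - 0.25 * decay  # site ends as one given other base
    return p_same, p_specific_change


def sequence_probability(parent, target, error_rate, duplications):
    """Probability that `parent` yields exactly `target` after the given
    number of duplications, treating sites as independent (an assumption)."""
    if len(parent) != len(target):
        raise ValueError("parent and target must have the same length")
    p_same, p_change = site_probabilities(error_rate, duplications)
    probability = 1.0
    for parent_base, target_base in zip(parent, target):
        probability *= p_same if parent_base == target_base else p_change
    return probability
```

For a single duplication the per-site probabilities reduce to `1 - error_rate` and `error_rate / 3`, so `sequence_probability("ACGT", "ACGA", 0.01, 1)` evaluates `(1 - 0.01)**3 * (0.01 / 3)`.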