Theory is developed for the process of sequencing randomly selected
large-insert clones. Genome size, library depth, clone size, and clone
distribution are considered relevant properties, and perfect overlap
detection for contig assembly is assumed. Genome-specific and nonrandom
effects are neglected. Order-of-magnitude analysis indicates that library
depth is of secondary importance compared to the other variables,
especially as clone size diminishes. In such cases, the well-known Poisson
coverage law is a good approximation. Parameters derived from these models
are used to examine performance for the specific case of sequencing random
human BAC clones. We compare coverage and redundancy rates for libraries
possessing uniform and nonuniform clone distributions. Results are measured
against data from map-based human-chromosome-2 sequencing. We conclude that
the map-based approach outperforms random clone sequencing, except early in
a project. However, simultaneous use of both strategies can be beneficial
if a performance-based estimate for halting random clone sequencing is
made. Results further show that the random approach yields maximum
effectiveness using nonbiased rather than biased libraries.
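
As a rough illustration of the Poisson coverage law mentioned above, the sketch below uses its textbook form, in which the expected covered fraction is 1 - exp(-N L / G) for N clones of length L drawn uniformly from a genome of size G. The function names and the numerical values (a 3 Gb genome, 150 kb BAC clones) are assumptions chosen for illustration. They are not parameters taken from this work, and the sketch does not reproduce the more detailed models developed here.

```python
import math


def poisson_coverage(num_clones, clone_size, genome_size):
    """Expected genome fraction covered by at least one clone.

    Textbook Poisson approximation: clones are placed uniformly at random
    and library depth and end effects are ignored.
    """
    redundancy = num_clones * clone_size / genome_size  # N * L / G
    return 1.0 - math.exp(-redundancy)


def clones_for_coverage(target, clone_size, genome_size):
    """Smallest clone count whose expected coverage reaches `target`."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must lie in [0, 1)")
    return math.ceil(-math.log(1.0 - target) * genome_size / clone_size)


if __name__ == "__main__":
    GENOME_SIZE = 3.0e9   # assumed genome size in bp (illustrative)
    CLONE_SIZE = 150_000  # assumed BAC insert size in bp (illustrative)
    for n in (10_000, 20_000, 40_000, 80_000):
        frac = poisson_coverage(n, CLONE_SIZE, GENOME_SIZE)
        print(f"{n:>6} clones: expected coverage {frac:.3f}")
```

Under these assumptions, each additional unit of redundancy uncovers a smaller share of new sequence, which is consistent with random selection performing best early in a project.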