Sequencing a genome by walking with clone-end sequences: A mathematical analysis

Citation
S. Batzoglou et al., Sequencing a genome by walking with clone-end sequences: A mathematical analysis, GENOME RES, 9(12), 1999, pp. 1163-1174
Number of citations
19
Subject categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
ISSN journal
1088-9051
Volume
9
Issue
12
Year of publication
1999
Pages
1163 - 1174
Database
ISI
SICI code
1088-9051(199912)9:12<1163:SAGBWW>2.0.ZU;2-H
Abstract
One approach to sequencing a large genome is (1) to sequence a collection of nonoverlapping "seeds" chosen from a genomic library of large-insert clones [such as bacterial artificial chromosomes (BACs)] and then (2) to take successive "walking" steps by selecting and sequencing minimally overlapping clones, using information such as clone-end sequences to identify the overlaps. In this paper we analyze the strategic issues involved in using this approach. We derive formulas showing how two key factors, the initial density of seed clones and the depth of the genomic library used for walking, affect the cost and time of a sequencing project, that is, the amount of redundant sequencing and the number of steps to cover the vast majority of the genome. We also discuss a variant strategy in which a second genomic library with clones having a somewhat smaller insert size is used to close gaps. This approach can dramatically decrease the amount of redundant sequencing, without affecting the rate at which the genome is covered.
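Illustrative sketch (not from the paper)
The walking step described in the abstract can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions that are not taken from the paper: a linear genome, a single seed at the left end, one-directional walking, fixed-length clones placed uniformly at random, and any overlap of at least one base counted as detectable. The function name `simulate_walk` and all of its parameters are hypothetical, and the code does not reproduce the formulas derived in the paper.

```python
import bisect
import random


def simulate_walk(genome_len, clone_len, depth, seed=0):
    """Toy one-directional clone walk from a single seed at position 0.

    Assumptions (illustrative only): fixed clone length, uniformly random
    clone start positions, and a one-base overlap is enough to detect.
    """
    rng = random.Random(seed)
    # Library depth = (number of clones * clone length) / genome length.
    n_clones = int(depth * genome_len / clone_len)
    starts = sorted(
        rng.randrange(0, genome_len - clone_len + 1) for _ in range(n_clones)
    )

    covered_end = clone_len  # the seed clone covers [0, clone_len)
    steps = 0
    redundant = 0  # bases sequenced again because of clone overlap

    while covered_end < genome_len:
        # Rightmost clone that starts inside the covered region:
        # it overlaps the current end minimally and extends the farthest.
        i = bisect.bisect_right(starts, covered_end - 1) - 1
        if i < 0 or starts[i] + clone_len <= covered_end:
            break  # no clone extends the contig: the walk hits a gap
        redundant += covered_end - starts[i]
        covered_end = starts[i] + clone_len
        steps += 1

    return {"covered_end": covered_end, "steps": steps, "redundant": redundant}


if __name__ == "__main__":
    # Compare a shallow and a deep walking library on the same toy genome.
    for d in (2, 5, 20):
        print(d, simulate_walk(2_000_000, 150_000, d, seed=1))
```

Running the sketch at several depths gives a rough feel for the trade-off the abstract mentions: a deeper library tends to offer clones with smaller overlaps, so fewer bases are resequenced per step, while a shallow library is more likely to stall at a gap.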