THE CHIMERIC MAPPING PROBLEM - ALGORITHMIC STRATEGIES AND PERFORMANCEEVALUATION ON SYNTHETIC GENOMIC DATA

Citation
D. Greenberg et S. Istrail, THE CHIMERIC MAPPING PROBLEM - ALGORITHMIC STRATEGIES AND PERFORMANCEEVALUATION ON SYNTHETIC GENOMIC DATA, Computers & chemistry, 18(3), 1994, pp. 207-220
Citations number
26
Categorie Soggetti
Computer Application, Chemistry & Engineering",Chemistry,"Computer Science Interdisciplinary Applications
Journal title
ISSN journal
00978485
Volume
18
Issue
3
Year of publication
1994
Pages
207 - 220
Database
ISI
SICI code
0097-8485(1994)18:3<207:TCMP-A>2.0.ZU;2-B
Abstract
The Human Genome Project requires better software for the creation of physical maps of chromosomes. Current mapping techniques involve break ing large segments of DNA into smaller, more-manageable pieces, gather ing information on all the small pieces, and then constructing a map o f the original large piece from the information about the small pieces . Unfortunately, in the process of breaking up the DNA some informatio n is lost and noise of various types is introduced; in particular, the order of the pieces is not preserved. Thus, the map maker must solve a combinatorial problem in order to reconstruct the map. Good software is indispensable for quick, accurate reconstruction. The reconstructi on is complicated by various experimental errors. A major source of di fficulty-which sems to be inherent to the recombination technology-is the presence of chimeric DNA clones. It is fairly common for two disjo int DNA pieces to form a chimera, i.e. a fusion of two pieces which ap pears as a single piece. Attempts to order chimera will fail unless th ey are algorithmically divided into their constituent pieces. Despite consensus within the genomic mapping community of the critical importa nce of correcting chimerism, algorithms for solving the chimeric clone problem have received only passing attention in the literature. Based on a model proposed by Lander (1992a, b) this paper presents the firs t algorithms for analyzing chimersim. We construct physical maps in th e presence of chimerism by creating optimization functions which have minimizations which correlate with map quality. Despite the fact that these optimization functions are invariably NP-complete our algorithms are guaranteed to produce solutions which are close to the optimum. T he practical import of using these algorithms depends on the strength of the correlation of the function to the map quality as well as on th e accuracy of the approximations. We employ two fundamentally differen t optimization functions as a means of avoiding biases likely to decor relate the solutions from the desired map. Experiments on simulated da ta show that both our algorithm which minimizes the number of chimeric fragments in a solution and our algorithm which minimizes the maximum number of fragments per clone in a solution do, in fact, correlate to high quality solutions. Furthermore, tests on simulated data using pa rameters set to mimic real experiments show that the algorithms have t he potential to find high quality solutions with real data. We plan to test our software against real data from the Whitehead Institute and from Los Alamos Genomic Research Center in the near future.