Detection of deleted genomic DNA using a semiautomated computational analysis of GeneChip data

Citation
H. Salamon et al., Detection of deleted genomic DNA using a semiautomated computational analysis of GeneChip data, GENOME RES, 10(12), 2000, pp. 2044-2054
Number of citations
7
Subject categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
ISSN journal
1088-9051
Volume
10
Issue
12
Year of publication
2000
Pages
2044 - 2054
Database
ISI
SICI code
1088-9051(200012)10:12<2044:DODGDU>2.0.ZU;2-U
Abstract
Genomic diversity within and between populations is caused by single nucleotide mutations, changes in repetitive DNA systems, recombination mechanisms, and insertion and deletion events. The contribution of these sources to diversity, whether purely genetic or of phenotypic consequence, can only be investigated if we have the means to quantitate and characterize diversity in many samples. With the advent of complete sequence characterization of representative genomes of different species, the possibility of developing protocols to screen for genetic polymorphism across entire genomes is actively being pursued. The large numbers of measurements such approaches yield demand that we pay careful attention to the numerical analysis of data. In this paper we present a novel application of an Affymetrix GeneChip to perform genome-wide screens for deletion polymorphism. A high-density oligonucleotide array formatted for mRNA expression and targeted at a fully sequenced 4.4-million-base pair Mycobacterium tuberculosis standard strain genome was adapted to compare genomic DNA. Hybridization intensities to 111,000 probe pairs (perfect complement and mismatch complement) were measured for genomic DNA from a clinical strain and from a vaccine organism. Because individual probe-pair hybridization intensities exhibit limited sensitivity/specificity characteristics to detect deletions, data-analytical methodology to exploit measurements from multiple probes in tandem locations across the genome was developed. The TSTEP (Tandem Set Terminal Extreme Probability) algorithm designed specifically to analyze the tandem hybridization measurements data was applied and shown to discover genomic deletions with high sensitivity. The TSTEP algorithm provides a foundation for similar efforts to characterize deletions in many hybridization measures in similar-sized and larger genomes. Issues relating to the design of genome content screening experiments and the implications of these methods for studying population genomics and the evolution of genomes are discussed.
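Illustrative sketch
The abstract notes that single probe pairs are unreliable deletion indicators and that the analysis therefore pools probes in tandem genomic locations. The Python sketch below illustrates only that general idea and is not the TSTEP algorithm, whose details are not given in this record. It assumes a hypothetical list of per-probe signal values ordered by genome position, plus a threshold and a minimum run length chosen by the user; the function name and all parameters are invented for illustration.

```python
from typing import List, Tuple


def find_low_signal_runs(
    signal: List[float], threshold: float, min_run: int
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run of at
    least `min_run` consecutive probes whose signal is below `threshold`.

    Hypothetical helper: `signal` is assumed to be ordered by genome position.
    """
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current low-signal run began, if any
    for i, value in enumerate(signal):
        if value < threshold:
            if start is None:
                start = i
        else:
            # The run (if any) ended at i; keep it only if it is long enough.
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Handle a run that extends to the last probe.
    if start is not None and len(signal) - start >= min_run:
        runs.append((start, len(signal)))
    return runs


# Made-up example values: one long low stretch and one isolated low probe.
example = [1.0, 0.9, 0.1, 0.2, 0.1, 0.15, 1.1, 0.1, 1.0]
candidate_runs = find_low_signal_runs(example, threshold=0.5, min_run=3)
print(candidate_runs)
```

Requiring several adjacent low probes discards the isolated low value in the example, which is the intuition behind combining tandem measurements rather than trusting any single probe pair.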