Uniform integration of genome mapping data using intersection graphs

Citation
E. Harley et al., Uniform integration of genome mapping data using intersection graphs, BIOINFORMAT, 17(6), 2001, pp. 487-494
Number of citations
29
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
ISSN journal
1367-4803
Volume
17
Issue
6
Year of publication
2001
Pages
487 - 494
Database
ISI
SICI code
1367-4803(200106)17:6<487:UIOGMD>2.0.ZU;2-A
Abstract
Motivation: The methods for analyzing overlap data are distinct from those for analyzing probe data, making integration of the two forms awkward. Conversion of overlap data to probe-like data elements would facilitate comparison and uniform integration of overlap data and probe data using software developed for analysis of STS data.
Results: We show that overlap data can be effectively converted to probe-like data elements by extracting maximal sets of mutually overlapping clones. We call these sets virtual probes, since each set determines a site in the genome corresponding to the region which is common among the clones of the set. Finding the virtual probes is equivalent to finding the maximal cliques of a graph. We modify a known maximal-clique algorithm such that it finds all virtual probes in a large dataset within minutes. We illustrate the algorithm by converting fingerprint and Alu-PCR overlap data to virtual probes. The virtual probes are then analyzed using double-linkage intersection graphs and structure graphs to show that methods designed for STS data are also applicable to overlap data represented as virtual probes. Next we show that virtual probes can produce a uniform integration of different kinds of mapping data, in particular STS probe data and fingerprint and Alu-PCR overlap data. The integrated virtual probes produce longer double-linkage contigs than STS probes alone, and in conjunction with structure graphs they facilitate the identification and elimination of anomalies. Thus, the virtual-probe technique provides: (i) a new way to examine overlap data; (ii) a basis on which to compare overlap data and probe data using the same systems and standards; and (iii) a unique and useful way to uniformly integrate overlap data with probe data.
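Illustrative sketch: the abstract equates virtual probes with the maximal cliques of a graph whose vertices are clones and whose edges are pairwise overlaps. The Python below shows that equivalence on a toy dataset. It uses the standard Bron-Kerbosch algorithm with pivoting as a stand-in. The abstract does not name the algorithm the authors modified, so this choice, the function names `build_overlap_graph` and `virtual_probes`, and the clone labels are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def build_overlap_graph(overlaps: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from pairwise clone overlaps."""
    graph: Dict[str, Set[str]] = {}
    for a, b in overlaps:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def virtual_probes(graph: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """Return all maximal cliques (maximal sets of mutually overlapping clones).

    Assumption: plain Bron-Kerbosch with pivoting, not the modified
    algorithm described in the paper.
    """
    cliques: List[FrozenSet[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        # r: current clique, p: candidates to extend it, x: already explored
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pick the pivot with the most neighbours among the candidates,
        # so that fewer branches need to be explored.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Hypothetical overlap data: clones A, B, C mutually overlap; C-D and D-E overlap.
overlaps = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
probes = virtual_probes(build_overlap_graph(overlaps))
print(sorted(sorted(clique) for clique in probes))
# prints [['A', 'B', 'C'], ['C', 'D'], ['D', 'E']]
```

Each returned set plays the role of one virtual probe: a site common to all of its clones. The resulting clone-by-probe incidence has the same shape as STS probe data, which is what would let STS-oriented software consume it. Clones with no recorded overlap do not appear in the graph in this sketch and would need separate handling.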