Exploiting parallelism in a structural scientific discovery system to improve scalability

Authors

Galal, GM; Cook, DJ; Holder, LB

Citation

Gm. Galal et al., Exploiting parallelism in a structural scientific discovery system to improve scalability, J AM S INFO, 50(1), 1999, pp. 65-73

Citations number

Subject categories

Library & Information Science

Journal title

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE

ISSN journal

0002-8231

Volume

50

Issue

1

Year of publication

1999

Pages

65 - 73

Database

ISI

SICI code

0002-8231(199901)50:1<65:EPIASS>2.0.ZU;2-8

Abstract

The large amount of data collected today is quickly overwhelming researchers' abilities to interpret the data and discover interesting patterns. Knowledge discovery and data mining approaches hold the potential to automate the interpretation process, but these approaches frequently utilize computationally expensive algorithms. In particular, scientific discovery systems focus on the utilization of richer data representation, sometimes without regard for scalability. This research investigates approaches for scaling a particular knowledge discovery in databases (KDD) system, SUBDUE, using parallel and distributed resources. SUBDUE has been used to discover interesting and repetitive concepts in graph-based databases from a variety of domains, but requires a substantial amount of processing time. Experiments that demonstrate scalability of parallel versions of the SUBDUE system are performed using CAD circuit databases and artificially-generated databases, and potential achievements and obstacles are discussed.