Optimizing path query performance: graph clustering strategies

Citation
Yw. Huang et al., Optimizing path query performance: graph clustering strategies, TRANS RES C, 8(1-6), 2000, pp. 381-408
Citations number
40
Categorie Soggetti
Civil Engineering
Journal title
TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES
ISSN journal
0968090X → ACNP
Volume
8
Issue
1-6
Year of publication
2000
Pages
381 - 408
Database
ISI
SICI code
0968-090X(200002/12)8:1-6<381:OPQPGC>2.0.ZU;2-Y
Abstract
Path queries over transportation networks are operations required by many G eographic Information Systems applications. Such networks, typically modele d as graphs composed of nodes and links and represented as link relations, can be very large and hence often need to be stored on secondary storage de vices. Path query computation over such large persistent networks amounts t o high I/O costs due to having to repeatedly bring in links from the link r elation from secondary storage into the main memory buffer for processing. This paper is the first to present a comparative experimental evaluation of alternative graph clustering solutions in order to show their effectivenes s in path query processing over transportation networks. Clustering optimiz ation is attractive because it does not incur any run-time cost, requires n o auxiliary data structures, and is complimentary to many of the existing s olutions on path query processing. In this payer, we develop a novel cluste ring technique, called spatial partition clustering (SPC), that exploits un ique properties of transportation networks such as spatial coordinates and high locality. We identify other promising candidates for clustering optimi zations from the literature? such as two-way partitioning and approximate t opological clustering. We fine-tune them to optimize their I/O behavior for path query processing. Our experimental evaluation of the performance of t hese graph clustering techniques using an actual city road network as well as randomly generated graphs considers variations in parameters such as mem ory buffer size, length of the paths, locality, and out-degree. Our experim ental results are the foundation for establishing guidelines to select the best clustering technique based on the type of networks. We rind that our S PC performs the best for the highly interconnected city map; the hybrid app roach for random graphs with high locality: and the two-way partitioning ba sed on link weights for random graphs with no locality. (C) 2000 Elsevier S cience Ltd. All rights reserved.