CHARACTERIZING AND SCHEDULING COMMUNICATION INTERACTIONS OF PARALLEL AND LOCAL JOBS ON NETWORKS OF WORKSTATIONS

Citation
Yf. Dong et al., CHARACTERIZING AND SCHEDULING COMMUNICATION INTERACTIONS OF PARALLEL AND LOCAL JOBS ON NETWORKS OF WORKSTATIONS, Computer communications, 21(5), 1998, pp. 470-484
Number of citations
17
Subject Categories
Computer Science Software Graphics Programming; Computer Science Hardware & Architecture; Computer Science Information Systems
Journal title
Computer communications
ISSN journal
01403664
Volume
21
Issue
5
Year of publication
1998
Pages
470 - 484
Database
ISI
SICI code
0140-3664(1998)21:5<470:CASCIO>2.0.ZU;2-N
Abstract
Networks of workstations (NOWs) are cost-effective platforms to perform parallel computation. Usually, a NOW is not dedicated to parallel jobs. Local users may run some applications in their workstations which involve communications as well. This paper examines the effects of communication interactions of parallel and local jobs on a nondedicated NOW. Three representative communication patterns of parallel jobs are considered. A quantitative model to characterize the interactions is proposed. Measurement results on a NOW support the analytical model and indicate that the network interface in the TCP/IP protocol forms a communication bottleneck during interactions because a standard network interface with a single input/output queue is not able to distinguish communication requests from parallel and local jobs. Therefore, small but important communication messages of a parallel job, such as a barrier synchronization, could be easily blocked by a communication request of a local job, which would degrade the performance of the parallel job significantly. A double queue scheme in the network interface is proposed. Using available information from the protocol layer, the scheme is able to distinguish the two types of communication requests and give a higher priority to parallel jobs' communication requests. The simulation results show that the scheme could improve the performance of parallel jobs without significantly affecting the performance of local jobs. (C) 1998 Elsevier Science B.V.
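The double queue idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DoubleQueueInterface`, the boolean `is_parallel` flag (standing in for the protocol-layer information the abstract mentions), and the strict-priority service order are all assumptions made for illustration only.

```python
from collections import deque


class DoubleQueueInterface:
    """Hypothetical model of a network interface with two output queues."""

    def __init__(self):
        self.parallel = deque()  # requests from parallel jobs (served first)
        self.local = deque()     # requests from local jobs

    def enqueue(self, request, is_parallel):
        # Assumption: the caller already knows the request type; the paper
        # derives this from protocol-layer information, not modeled here.
        if is_parallel:
            self.parallel.append(request)
        else:
            self.local.append(request)

    def dequeue(self):
        # Assumption: strict priority. Parallel-job requests are always
        # served before local-job requests; each queue stays FIFO.
        if self.parallel:
            return self.parallel.popleft()
        if self.local:
            return self.local.popleft()
        return None


nic = DoubleQueueInterface()
nic.enqueue("local: file transfer chunk 1", is_parallel=False)
nic.enqueue("local: file transfer chunk 2", is_parallel=False)
nic.enqueue("parallel: barrier sync", is_parallel=True)

# Drain the interface and record the service order.
order = []
while (req := nic.dequeue()) is not None:
    order.append(req)
print(order)
```

With a single FIFO queue the barrier message would wait behind both local requests; in this sketch it is served first, while the local requests keep their relative order.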