ITA
ENG

The efficient computation of ownership sets in HPF

Authors

Joisha, PG Banerjee, P

Citation

Pg. Joisha et P. Banerjee, The efficient computation of ownership sets in HPF, IEEE PARALL, 12(8), 2001, pp. 769-788

Citations number

Categorie Soggetti

Computer Science & Engineering

Journal title

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS

ISSN journal

10459219 → ACNP

Volume

Issue

Year of publication

2001

Pages

769 - 788

Database

ISI

SICI code

1045-9219(200108)12:8<769:TECOOS>2.0.ZU;2-Z

Abstract

Ownership sets are fundamental to the partitioning of program computations across processors by the owner-computes rule. These sets arise due to the m apping of arrays onto processors. In this paper, we focus on how ownership sets can be efficiently determined in the context of the HPF language and s how how the structure of these sets can be symbolically characterized in th e presence of arbitrary array alignment and array distribution directives. Our starting point is a system of equalities and inequalities due to Ancour t et al. [1] that captures the array mapping problem in HPF. We arrive at a refined system that enables us to efficiently solve for the ownership set using the Fourier-Motzkin Elimination technique and that requires the cours e vector as the only auxiliary vector. The formulation makes it possible to enumerate the elements of the ownership set exactly once, a feature that i s very beneficial when such sets are applied to handle Do loops qualified b y HPF's INDEPENDENT directive. We develop important and general properties pertaining to HPF alignments and distributions and show how they can be use d to eliminate redundant communication due to array replication. Polynomial -time schemes that determine whether the ownership set of a particular proc essor, with respect to some array, is the empty set or whether the ownershi p set of every processor, with respect to some array, is the empty set, are presented. We show how distribution directives with unspecified processor meshes can be efficiently handled at compile time. We also show how to avoi d the generation of communication code when pairs of array references are u ltimately mapped onto the same processors. Experimental data demonstrating the improved code performance that the latter optimization enables is prese nted and discussed.