Parallel functional languages use single valued variables to avoid sem
antically irrelevant data dependence constraints. Programs containing
iterations that redefine variables in a procedural language have the c
orresponding variables declared with additional dimensions in a single
assignment language. This extra temporal dimension, unless optimized,
requires an exorbitant amount of memory and in parallel programs impo
ses a large delay between the data producer and consumers. For certain
loop arrangements, a window containing a few elements of the dimensio
n can be created. Usually, there are many ways for defining a loop arr
angement in an implementation of a functional program and a trade-off
between the memory saving and the needed level of parallelism has to b
e taken into account when selecting the implementation. In this paper
we prove that the problem of determining the best loop arrangement by
partitioning the dependence graph is NP-hard. In addition, we describe
a heuristic for solving this problem. Finally, we present examples of
parallel functional programs in which the memory optimization results
in reducing the local and shared memory requirements and communicatio
n delays.