Nearly all personal computer and workstation processors, and virtually all
high-performance embedded processor cores, now embody instruction-level
parallel (ILP) processing in the form of superscalar or very long instruction
word (VLIW) architectures. ILP processors put much more of a burden on
compilers; without "heroic" compiling techniques, most such processors fall far
short of their performance goals. Those techniques are largely found in
the high-level optimization phase and in the code generation phase; they are
also collectively called instruction scheduling. This paper reviews the
state of the art in code generation for ILP processors.
Modern ILP code generation methods move code across basic block boundaries.
These methods grew out of techniques for generating horizontal microcode,
so we introduce the problem by describing its history. Most modern
approaches can be categorized by the shape of the scheduling "region." Some
of these regions are loops, and for those, techniques known broadly as
"software pipelining" are used. Software pipelining techniques are considered
here only when they raise issues relevant to the region-based techniques
presented.
The selection of a type of region to use in this process is one of the most
controversial questions in code generation; the paper surveys the best-known
alternatives. The paper then considers two questions. First, given a type of
region, how does one pick specific regions of that type in the intermediate
code? In conjunction with region selection, we consider region enlargement
techniques such as unrolling and branch target expansion. The second
question, how one constructs a schedule once regions have been selected,
occupies the next section of the paper. Finally, schedule construction is
reexamined using recent, innovative resource modeling based on finite-state
automata. The paper includes an extensive bibliography.