The Java Virtual Machine (JVM) is the cornerstone of Java technology, and
its efficiency in executing portable Java bytecodes is crucial to the
success of this technology. Interpretation, Just-in-Time (JIT) compilation,
and hardware realization are well-known solutions for a JVM, and previous
research has proposed optimizations for each of these techniques. However,
each technique has its pros and cons and may not be uniformly attractive
for all hardware platforms. Instead, an understanding of the architectural
implications of JVM implementations with real applications can be crucial
to the development of enabling technologies for efficient Java runtime
system development on a wide range of platforms. Toward this goal, this
paper examines architectural issues from both the hardware and JVM
implementation perspectives. The paper starts by identifying the important
execution characteristics of Java applications from a bytecode perspective.
It then explores the potential of a smart JIT compiler strategy that can
dynamically interpret or compile based on associated costs, and
investigates the CPU and cache architectural support that would benefit
JVM implementations. We also study the available parallelism during the
different execution modes using applications from the SPECjvm98 benchmarks.
At the bytecode level, it is observed that fewer than 45 of the 256
bytecodes constitute 90 percent of the dynamic bytecode stream. Method
sizes fall into a trimodal distribution with peaks at 1, 9, and 26
bytecodes across all benchmarks. The architectural issues explored in this
study show that, when Java applications are executed with a JIT compiler,
selective translation using good heuristics can improve performance, but
the saving is only 10-15 percent at best. The instruction and data cache
performance of Java applications is seen to be better than that of C/C++
applications, except for data cache performance in the JIT mode. Write
misses resulting from the installation of JIT compiler output dominate the
misses and degrade data cache performance in JIT mode. A study of the
available parallelism shows that Java programs executed using JIT compilers
have parallelism comparable to C/C++ programs for small window sizes, but
fall behind when the window size is increased. Java programs executed using
the interpreter have very little parallelism due to the stack nature of the
JVM instruction set, which is dominant in the interpreted execution mode.
In addition, this work gives revealing insights and architectural proposals
for designing an efficient Java runtime system.
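
The cost-based choice between interpreting and compiling a method can be illustrated with a minimal sketch. It is not the heuristic evaluated in the paper: the class name, the method name, and all four cost parameters are hypothetical, and the sketch assumes that per-method costs are known or can be estimated in a common time unit. Under those assumptions, a method is worth translating only if the one-time translation cost plus the native execution cost is lower than the cost of interpreting every invocation.

```java
// Illustrative sketch only; names and cost model are assumptions, not the paper's.
final class SelectiveTranslation {

    /**
     * Decides whether a method should be JIT-compiled or left to the interpreter.
     *
     * @param invocations          estimated number of calls to the method
     * @param interpretCostPerCall estimated cost of interpreting one call
     * @param translateCost        one-time cost of translating the method
     * @param nativeCostPerCall    estimated cost of one call to the compiled code
     * @return true if compiling is estimated to be cheaper than interpreting
     */
    static boolean shouldCompile(long invocations,
                                 double interpretCostPerCall,
                                 double translateCost,
                                 double nativeCostPerCall) {
        // Total cost if every call is interpreted.
        double interpretTotal = invocations * interpretCostPerCall;
        // Total cost if the method is translated once and then run natively.
        double compileTotal = translateCost + invocations * nativeCostPerCall;
        return compileTotal < interpretTotal;
    }

    public static void main(String[] args) {
        // A frequently called method amortizes the translation cost.
        System.out.println("hot method:  "
                + (shouldCompile(1_000, 50.0, 10_000.0, 5.0) ? "compile" : "interpret"));
        // A method called once does not.
        System.out.println("cold method: "
                + (shouldCompile(1, 50.0, 10_000.0, 5.0) ? "compile" : "interpret"));
    }
}
```

In this toy model, the benefit of selective translation comes entirely from methods whose invocation count is too low to amortize the translation cost, which is consistent with a bounded overall saving.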