VI4 (was: RE: [ANN]Draft rough plan for 3.6!)

15 Apr 2003


      Anthony Hannan writes:
...
I know past Jitters did not require a format change.  But since Ian is
on Eliot's Bytecode-to-Bytecode Adaptive Optimization for Smalltalk
project (along with myself, Dan, Marcus, and John Sarkela) I suspect the
next Jitter will be a Smalltalk Jitter producing optimized bytecodes. 
Some of these bytecodes will be new special low-level bytecodes, which
will require a new image format.  Even if Ian produces a C Jitter before
that with no format change, it will speed things up enough to not need
VI4.  Only the compiled method format change will be missing, which
again is not important.
I'm not sure that a Jitter using Eliot's work will require a new image
format. It does need special low-level bytecodes, but those could be
provided by another bytecode set. It doesn't require changing the
original bytecodes, just the generated low-level ones. Think of the
low-level bytecodes as an intermediate language describing the
boundary between Smalltalk with the compiler and the optimizer, and
the VM with the code generation.
There are a few things that would be nice purely for performance in an
image change. Having a tag bit of 0 rather than 1 for integers would
shave 3 instructions off simple arithmetic, taking it down to 5
instructions on an x86. Whether this is worth the bother would require
a little analysis and probably playing with some basic optimizations,
like those Ian Piumarta describes in "J3 for Squeak." Simple
optimization across bytecodes should be able to remove a lot of
needless tagging and untagging.
http://www-sor.inria.fr/~piumarta/squeak/unix/zip/j3-2.6.0/doc/j3/
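
To make the tag bit point concrete, here is a rough sketch in Python of
tagged integer addition under both encodings. It is only an
illustration, not VM code: the helper names, the 32-bit word size, and
the use of None to mean "fall back to a real message send" are all
assumptions of mine, and it says nothing about actual instruction
counts.

```python
WORD_BITS = 32  # assumed word size; small integers get WORD_BITS - 1 bits


def fits(n):
    """True if n is representable as a tagged small integer."""
    limit = 1 << (WORD_BITS - 2)
    return -limit <= n < limit


def tag1(n):
    """Encode with a low tag bit of 1 (the current scheme)."""
    return (n << 1) | 1


def tag0(n):
    """Encode with a low tag bit of 0 (the proposed scheme)."""
    return n << 1


def untag(word):
    """Decode either encoding: an arithmetic shift drops the tag bit."""
    return word >> 1


def add_tag1(a, b):
    """Add two tag-1 words; None means 'take the slow path'."""
    if not (a & b & 1):        # both operands must carry the integer tag
        return None
    result = a + (b - 1)       # strip one tag so the sum keeps exactly one
    return result if fits(untag(result)) else None


def add_tag0(a, b):
    """Add two tag-0 words; None means 'take the slow path'."""
    if (a | b) & 1:            # both operands must have a clear tag bit
        return None
    result = a + b             # no tag fix-up needed at all
    return result if fits(untag(result)) else None
```

With a tag of 0 the sum of two tagged words is already correctly
tagged, so the fix-up step disappears and only the type check and the
overflow check remain.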
The cost of optimizing across bytecodes is losing synchronization
points. It is no longer simple to reenter the method at any
bytecode. The debugger could have single-stepped into an "intermediate"
position, so theoretically any point is a reentry point for a method.
This does inhibit optimization. There are various solutions, but I'm
delaying thinking more about it until I have more experience.
...
So I believe we can live with the 15% slowdown for a year or so.  If
not we can try to speed up the interpreter a little by just including
stack enhancements but not bytecode enhancements so we don't change the
image format.  I think any bytecode changes should be in conjunction
with Eliot's project or some Jitter project.  And to reiterate, I think
the compiled method format change should only be included on the back of
some other more significant image format change like bytecodes.
Is it possible to speed up proper block closures using less drastic
measures? The commercial Smalltalks have had this problem for some
years now. I remember reading some papers on the subject but can't
remember the details. Most blocks are probably simple and can be
handled as a special case quickly. A compiler could easily spot whether
the block returned or accessed the method's variables.
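
As a rough sketch of the kind of check I have in mind (in Python, just
to show the idea): the Block record and the is_simple function below
are made up for illustration and are not taken from any existing
compiler. The sketch assumes the compiler already knows which names a
block references and whether it contains a return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical summary of a block as a compiler might record it."""
    params: frozenset = frozenset()       # block arguments
    temps: frozenset = frozenset()        # temporaries declared in the block
    referenced: frozenset = frozenset()   # every variable name the block uses
    has_return: bool = False              # block contains a ^ return


def is_simple(block, method_vars):
    """True if the block neither returns nor touches the method's variables.

    method_vars holds the arguments and temporaries of the enclosing
    method. A simple block needs no captured environment, so it could
    be handled by a cheaper special case.
    """
    own = block.params | block.temps
    outer = (block.referenced - own) & frozenset(method_vars)
    return not block.has_return and not outer
```

For example, [:x | x + 1] would count as simple, while
[:x | x + total] (with total a method temporary) and [:x | ^x] would
not.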
Personally, a 15% slowdown seems reasonable for proper block
closures, especially if there are several different ways to regain the
speed, some of which are relatively simple to implement.
Bryce