Anthony Hannan writes:
I know past Jitters did not require a format change. But since Ian is on Eliot's Bytecode-to-Bytecode Adaptive Optimization for Smalltalk project (along with myself, Dan, Marcus, and John Sarkela) I suspect the next Jitter will be a Smalltalk Jitter producing optimized bytecodes. Some of these bytecodes will be new special low-level bytecodes, which will require a new image format. Even if Ian produces a C Jitter before that with no format change, it will speed things up enough to not need VI4. Only the compiled method format change will be missing, which again is not important.
I'm not sure that a Jitter using Eliot's work will require a new image format. It does need special low-level bytecodes, but those could be provided by another bytecode set. It doesn't require changing the original bytecodes, just the generated low-level ones. Think of the low-level bytecodes as an intermediate language describing the boundary between Smalltalk, with the compiler and the optimizer, and the VM, with the code generation.
There are a few things that would be nice in an image change purely for performance. Having a tag bit of 0 rather than 1 for integers would shave 3 instructions off simple arithmetic, taking it down to 5 instructions on an x86. Whether this is worth the bother would require a little analysis and probably some playing with basic optimizations, like those Ian Piumarta describes in "J3 for Squeak." Simple optimization across bytecodes should be able to remove a lot of needless tagging and untagging.
http://www-sor.inria.fr/~piumarta/squeak/unix/zip/j3-2.6.0/doc/j3/
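To make the tagging point concrete, here is a rough sketch in Python of integer addition under the two schemes. It is only an illustration, not VM code: the 32-bit word size and all of the function names are my assumptions, and overflow checking (which a real VM needs) is left out. It does not try to reproduce the instruction counts, only the extra tag fix-up that a tag bit of 1 forces on every add.

```python
WORD_BITS = 32                      # assumed word size
MASK = (1 << WORD_BITS) - 1         # keep results inside one machine word


def to_signed(word):
    """Interpret a WORD_BITS-wide unsigned word as a signed integer."""
    if word & (1 << (WORD_BITS - 1)):
        return word - (1 << WORD_BITS)
    return word


def tag1(n):
    """Encode n with a low tag bit of 1 (value shifted left, tag set)."""
    return ((n << 1) | 1) & MASK


def tag0(n):
    """Encode n with a low tag bit of 0 (value simply shifted left)."""
    return (n << 1) & MASK


def untag(word):
    """Decode either encoding: an arithmetic shift right drops the tag bit."""
    return to_signed(word) >> 1


def add_tag1(a, b):
    """Add two tag-1 words: one tag must be cleared first so the sum
    still carries exactly one tag bit. No overflow check in this sketch."""
    return (a - 1 + b) & MASK


def add_tag0(a, b):
    """Add two tag-0 words: the plain machine add already yields a
    correctly tagged result. No overflow check in this sketch."""
    return (a + b) & MASK


if __name__ == "__main__":
    print(untag(add_tag1(tag1(3), tag1(4))))
    print(untag(add_tag0(tag0(3), tag0(4))))
```

With a tag of 0 the add needs no fix-up at all, which is where the saving would come from. An optimizer working across bytecodes could get a similar effect by keeping intermediate values untagged.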
The cost of optimizing across bytecodes is losing synchronization points. It is no longer simple to reenter the method at any bytecode. The debugger could have single-stepped into an "intermediate" position, so theoretically any point is a reentry point for a method. This does inhibit optimization. There are various solutions, but I'm delaying thinking more about it until I have more experience.
So I believe we can live with the 15% slowdown for a year or so. If not, we can try to speed up the interpreter a little by just including stack enhancements but not bytecode enhancements, so we don't change the image format. I think any bytecode changes should be in conjunction with Eliot's project or some Jitter project. And to reiterate, I think the compiled method format change should only be included on the back of some other more significant image format change like bytecodes.
Is it possible to speed up proper block closures using less drastic measures? The commercial Smalltalks have had this problem for some years now. I remember reading some papers on the subject but can't remember the details. Most blocks are probably simple and can be handled quickly as a special case. A compiler could easily spot whether a block contains a return or accesses the method's variables.
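As a rough sketch of the kind of check I mean, the Python below classifies a block as "simple" when it contains no return and references no variables from the enclosing method. The `Block` structure and every field name are hypothetical, invented only for this illustration. I am also assuming that `referenced` lists only temporaries and arguments, ignoring instance variables and globals.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical summary of one parsed block."""
    params: set = field(default_factory=set)       # block arguments
    temps: set = field(default_factory=set)        # block-local temporaries
    referenced: set = field(default_factory=set)   # temp/arg names used in the body
    has_return: bool = False                       # body contains a ^ return
    inner: list = field(default_factory=list)      # blocks nested inside this one


def free_variables(block):
    """Names the block uses but does not define itself, including names
    that nested blocks need and this block does not supply."""
    local = block.params | block.temps
    free = set(block.referenced) - local
    for nested in block.inner:
        free |= free_variables(nested) - local
    return free


def contains_return(block):
    """True if the block, or any block nested in it, contains a return."""
    return block.has_return or any(contains_return(b) for b in block.inner)


def is_simple(block):
    """A simple block never returns from its method and never touches
    the enclosing method's variables, so it could take a fast path."""
    return not contains_return(block) and not free_variables(block)


if __name__ == "__main__":
    increment = Block(params={"x"}, referenced={"x"})            # [:x | x + 1]
    accumulate = Block(params={"x"}, referenced={"x", "total"})  # [:x | total + x]
    print(is_simple(increment), is_simple(accumulate))
```

Blocks that fail the test would still go through the general closure mechanism. Only the ones that pass would be candidates for the cheap special case.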
Personally, I think a 15% slowdown seems reasonable for proper block closures, especially if there are several different ways to regain the speed, some of which are relatively simple to implement.
Bryce