On Tue, Jun 19, 2012 at 12:19 AM, Colin Putney colin@wiresong.com wrote:
So I was poking around in Compiler today, and noticed that it's a bit... messy. I had a few ideas for improvement, but before I go monkeying with such an important part of the system, I thought I'd bring it up here. Do we have a long- or medium-term plan for how the compiler should evolve?
As I see it there are a few paths available to us:
- Incremental improvement. The compiler we have now is tried and
true. Now that we have proper block closures and a high performance VM, there's no real need to improve bytecode generation, so we shouldn't put much effort into this part of the system. We'll just make small improvements and minor refactorings as needed.
I wouldn't say that the bytecode set is satisfactory. Some things that are wrong are
- limited to 256 literals
- limited to 32 temporaries, 16 arguments (IIRC)
- limited room for expansion (and in Newspeak there is no room for expansion)
I'd like to see a set with
- a primitive bytecode to move the primitive number field out of the header and make more room for num args, num literals (64k would be a fine limit) etc.
- a design based around prefixes for extended indices a la VW. lifts the limits on addressability, arity etc while keeping the bytecode compact
- some mild experiments such as a nop which can be used e.g. to express metadata that guides the decompiler (this is a to:do:, etc).
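To make the prefix idea concrete, here is a rough sketch in Python. It is not the VW encoding or any real Squeak bytecode set: the opcode values `EXT` and `PUSH_LITERAL` and both function names are made up for illustration. Small indices cost two bytes, and each additional prefix pair contributes eight more index bits.

```python
# Hypothetical opcodes, chosen only for this sketch.
EXT = 0xE0           # prefix: "extend the index of the next instruction"
PUSH_LITERAL = 0x20  # takes a one-byte index operand


def encode_push_literal(index):
    """Emit PUSH_LITERAL, preceded by one EXT prefix per extra index byte."""
    if index < 0:
        raise ValueError("index must be non-negative")
    low = index & 0xFF
    high = index >> 8
    prefixes = []
    while high > 0:
        prefixes.append(high & 0xFF)
        high >>= 8
    out = bytearray()
    # Most significant prefix byte goes first.
    for b in reversed(prefixes):
        out += bytes([EXT, b])
    out += bytes([PUSH_LITERAL, low])
    return bytes(out)


def decode_push_literal(code):
    """Recover the index by accumulating prefix bytes, then the operand."""
    ext = 0
    pc = 0
    while code[pc] == EXT:
        ext = (ext << 8) | code[pc + 1]
        pc += 2
    if code[pc] != PUSH_LITERAL:
        raise ValueError("expected PUSH_LITERAL")
    return (ext << 8) | code[pc + 1]
```

With this scheme the common case (index below 256) stays as compact as a plain one-operand bytecode, and only the rare large index pays for the prefixes.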
Even more interesting would be metadata that allowed the discovery of inlined blocks so that e.g. mustBeBoolean is instead handled by dynamically creating closures and the relevant ifTrue:ifFalse: message so that these can be inlined for true/false but reimplemented for other objects.
- Adopt an existing project. There have been a few "new compiler"
projects over the years, and one or another of them might present an opportunity for significant improvement over the status quo. I'm thinking of ByteSurgeon, Opal, AOStA etc. It's not something we'll rush into, but eventually, when the code is mature, we'll want to replace the current compiler.
- Something completely new. Now that we have closures and a fast VM,
existing projects aren't relevant anymore, but we have new opportunities for improvement. VM-level changes, such as a new object format or new bytecodes could drive this option, if they're big enough that significant work on the compiler is required anyway. Maybe we can only see the broad outlines of what the project might look like at the moment, but we can see it on the horizon.
Well, my refactoring of the compiler to move instruction encoding out of ParseNode general instances and into BytecodeEncoder takes the pressure off as far as changing the bytecode set. There's still a need for refactoring in InstructionStream and CompiledMethod to handle bytecode set change. It is really a BytecodeEncoder or InstructionStream that understands how a bytecode set works, and not CompiledMethod (in e.g. readsField etc).
So, are there any pain points right now that we should think about
addressing? Is anybody planning or considering working on something compiler-related?
For me at least, it has to take second place to the new object representation, because there's much more benefit to be derived from the object representation.
Eliot, is there anything in the new object format that will have an
impact on image-side compilation?
I don't think so. It should be entirely orthogonal.
I seem to remember you mentioning something about efficiently supporting alternate bytecode sets. Is that meant for Newspeak, or do you have something in mind for Smalltalk?
It is a convenient way of migrating the bytecode set. Better than my EncoderForLongFormV3 approach.
I don't think we have to come up with a definitive plan just now, I
just want to get a sense of what people are thinking.
Colin
On Tue, Jun 19, 2012 at 6:36 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Even more interesting would be metadata that allowed the discovery of inlined blocks so that e.g. mustBeBoolean is instead handled by dynamically creating closures and the relevant ifTrue:ifFalse: message so that these can be inlined for true/false but reimplemented for other objects.
Ah, interesting. That would be even better than the [aBlock value] pattern in VW, effectively a PIC in the bytecode.
Well, my refactoring of the compiler to move instruction encoding out of ParseNode general instances and into BytecodeEncoder takes the pressure off as far as changing the bytecode set. There's still a need for refactoring in InstructionStream and CompiledMethod to handle bytecode set change. It is really a BytecodeEncoder or InstructionStream that understands how a bytecode set works, and not CompiledMethod (in e.g. readsField etc).
Ok, it sounds to me like we're somewhere between 1. and 3. There's no need for any big architectural changes to the compiler, but some would be good. Thanks.
Colin
On 2012-06-20, at 03:36, Eliot Miranda wrote:
I wouldn't say that the bytecode set is satisfactory. Some things that are wrong are
- limited to 256 literals
- limited to 32 temporaries, 16 arguments (IIRC)
In theory 64 temps, in practice a unary method can have at most 56 temps, but that's shared with the stack, so if you are sending a message with 9 arguments it's only 46 usable temps. To overcome this we would need larger context objects. Not sure how hard-coded that is in the VM.
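One way to read the arithmetic above, as a sketch only: it assumes the 56 slots quoted for a unary method are shared between temps and the stack, and that a send needs the receiver plus its arguments on the stack with nothing else live at that point. The names `LARGE_CONTEXT_SLOTS` and `usable_temps` are hypothetical.

```python
LARGE_CONTEXT_SLOTS = 56  # figure quoted above for a unary method


def usable_temps(max_send_args, slots=LARGE_CONTEXT_SLOTS):
    """Slots left for temps once the deepest send has its stack space."""
    # Assumption: a send pushes the receiver plus each argument.
    stack_needed = 1 + max_send_args
    return slots - stack_needed


print(usable_temps(9))  # 56 - (1 + 9) = 46
```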
- limited room for expansion (and in Newspeak there is no room for expansion)
I would add:
- limited to 1k jump distance
I'd like to see a set with
- a primitive bytecode to move the primitive number field out of the header and make more room for num args, num literals (64k would be a fine limit) etc.
- a design based around prefixes for extended indices a la VW. lifts the limits on addressability, arity etc while keeping the bytecode compact
- some mild experiments such as a nop which can be used e.g. to express metadata that guides the decompiler (this is a to:do:, etc).
I have run into these hard limits, too. My idea was to hack the compiler on the image side using the current bytecode set. E.g., if there are more than 255 literals, use the 256th as a literal array and generate code that pulls the literal out of that array (and if needed performs the send). Similarly for temps.
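The literal spill-over could be sketched roughly like this in Python. It models only the allocation decision, not real Squeak code generation; `MAX_DIRECT`, `allocate_literals` and `fetch` are hypothetical names, and the threshold of 255 is taken from the description above.

```python
MAX_DIRECT = 255  # literals addressed directly; the 256th slot holds the rest


def allocate_literals(literals):
    """Return (frame, access): the literal frame and, per literal, how to reach it."""
    if len(literals) <= MAX_DIRECT:
        frame = list(literals)
        access = [("direct", i) for i in range(len(literals))]
        return frame, access
    overflow = list(literals[MAX_DIRECT:])
    # The 256th slot (index 255) holds an array with everything that did not fit.
    frame = list(literals[:MAX_DIRECT]) + [overflow]
    access = [("direct", i) for i in range(MAX_DIRECT)]
    access += [("overflow", j) for j in range(len(overflow))]
    return frame, access


def fetch(frame, how):
    """What the generated code would do: one push, or a push plus an array access."""
    kind, i = how
    if kind == "direct":
        return frame[i]
    return frame[MAX_DIRECT][i]
```

An overflow literal costs an extra array access at run time, which is where the slowdown for oversized methods would come from.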
Extending the jump distance is harder. One could imagine "jump pads" sprinkled throughout the compiled method so there would be several hops for one jump. Seems to be hairy though. Or possibly one could compile the blocks that exceed the max distance as actual blocks + message sends.
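The "jump pad" idea might look something like the following sketch, which plans the hops for one forward jump. It assumes a maximum hop of 1023 bytes, measures distance simply as the difference between positions (ignoring the size of the jump instructions themselves), and takes the pad positions as given. `MAX_JUMP` and `plan_hops` are hypothetical names.

```python
MAX_JUMP = 1023  # assumed longest single forward jump, in bytes


def plan_hops(source, target, pads, max_jump=MAX_JUMP):
    """Return the positions to jump to, in order, ending at target."""
    if target < source:
        raise ValueError("this sketch only handles forward jumps")
    hops = []
    pos = source
    while target - pos > max_jump:
        # Greedy choice: take the farthest pad still within reach.
        reachable = [p for p in pads if pos < p <= pos + max_jump]
        if not reachable:
            raise ValueError("no jump pad within reach of %d" % pos)
        pos = max(reachable)
        hops.append(pos)
    hops.append(target)
    return hops
```

A compiler using this would also have to decide where to place the pads and jump around them in straight-line code, which is the hairy part mentioned above.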
This would allow large methods to compile, but methods exceeding the limits would be slow. For the generated code where I needed it that would have been fine, and saved considerable effort (I had to change the code generation to break up large methods into smaller ones, with temps spilling over into instance variables, and thisContext magic to emulate local returns).
So if someone wants to have fun with the compiler, working around (some of) these limits would be great. And if we get a new bytecode set, the code would become faster, too.
- Bert -
On Wed, Jun 20, 2012 at 3:36 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Tue, Jun 19, 2012 at 12:19 AM, Colin Putney colin@wiresong.com wrote:
So I was poking around in Compiler today, and noticed that it's a bit... messy. I had a few ideas for improvement, but before I go monkeying with such an important part of the system, I thought I'd bring it up here. Do we have a long- or medium-term plan for how the compiler should evolve?
As I see it there are a few paths available to us:
- Incremental improvement. The compiler we have now is tried and
true. Now that we have proper block closures and a high performance VM, there's no real need to improve bytecode generation, so we shouldn't put much effort into this part of the system. We'll just make small improvements and minor refactorings as needed.
I wouldn't say that the bytecode set is satisfactory. Some things that are wrong are
- limited to 256 literals
- limited to 32 temporaries, 16 arguments (IIRC)
- limited room for expansion (and in Newspeak there is no room for
expansion)
I'd like to see a set with
- a primitive bytecode to move the primitive number field out of the
header and make more room for num args, num literals (64k would be a fine limit) etc.
Would it not also be worth having space to store the endPC?
- a design based around prefixes for extended indices a la VW. lifts the
limits on addressability, arity etc while keeping the bytecode compact
- some mild experiments such as a nop which can be used e.g. to express
metadata that guides the decompiler (this is a to:do:, etc).
Even more interesting would be metadata that allowed the discovery of inlined blocks so that e.g. mustBeBoolean is instead handled by dynamically creating closures and the relevant ifTrue:ifFalse: message so that these can be inlined for true/false but reimplemented for other objects.
That would be lovely for proxies that proxify booleans.
Cheers,
vm-dev@lists.squeakfoundation.org