Folks should try the fix and see
http://idisk.mac.com/bertfreudenberg-Public/temp
Double-click EtoysToGo.app and drop in wendy.pr for it to explode, well or not after the fix is applied.
--
John M. McIntosh johnmci@smalltalkconsulting.com Twitter: squeaker68882
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
On Tue, May 05, 2009 at 11:37:43AM -0700, John M McIntosh wrote:
Folks should try the fix and see
http://idisk.mac.com/bertfreudenberg-Public/temp
Double-click EtoysToGo.app and drop in wendy.pr for it to explode, well or not after the fix is applied.
I did not try the exploding wendy.pr test, but your ClosureVMPopKiller-M7349 definitely fixes some stack balance bugs, so I loaded it in the VMMaker project on SqS in VMMaker-dtl.122.
The Mantis 7349 issue is marked as status "testing" since I did not actually perform the wendy.pr test. I'll move it to "resolved" next week if no one cites evidence to the contrary.
r.e. your notes in Mantis 7349:
Was there not some recovery code for unbalanced stacks somewhere? My concern is that there are other examples of this which we have not yet crashed over. So how would we fix the VM to avoid this? Or do we need to check all the plugin prim code for coding issues?
Yes there are sure to be more unbalanced stack bugs, and yes somebody should probably check all the plugin code. Most likely nobody will get around to doing that but no worries, the stack VM seems to do an excellent job of finding these bugs :)
Dave
On 06.05.2009, at 02:36, David T. Lewis wrote:
On Tue, May 05, 2009 at 11:37:43AM -0700, John M McIntosh wrote:
Folks should try the fix and see
http://idisk.mac.com/bertfreudenberg-Public/temp
Double-click EtoysToGo.app and drop in wendy.pr for it to explode, well or not after the fix is applied.
I did not try the exploding wendy.pr test,
Well, what I uploaded there is just our not-yet-finished "Etoys To Go" project that's supposed to be runnable from a USB flash drive, on any platform. The VMs for Win and Linux there are not yet updated. Etoys actually does not need closure support (yet) but a feature to resolve relative directories (so the user data will also land on the flash drive). John put that feature in his 4.0 series which supports closures, so I needed to use that one. And then I noticed it blows up on entering sandbox mode.
but your ClosureVMPopKiller-M7349 definitely fixes some stack balance bugs, so I loaded it in the VMMaker project on SqS in VMMaker-dtl.122.
The Mantis 7349 issue is marked as status "testing" since I did not actually perform the wendy.pr test. I'll move it to "resolved" next week if no one cites evidence to the contrary.
r.e. your notes in Mantis 7349:
Was there not some recovery code for unbalanced stacks somewhere? My concern is that there are other examples of this which we have not yet crashed over. So how would we fix the VM to avoid this? Or do we need to check all the plugin prim code for coding issues?
Yes there are sure to be more unbalanced stack bugs, and yes somebody should probably check all the plugin code. Most likely nobody will get around to doing that but no worries, the stack VM seems to do an excellent job of finding these bugs :)
How costly would it be to always do this:
interpreterProxy pop: interpreterProxy methodArgumentCount + 1 thenPush: result
or
interpreterProxy pop: interpreterProxy methodArgumentCount
to return self?
- Bert -
On Wed, May 06, 2009 at 12:34:27PM +0200, Bert Freudenberg wrote:
On 06.05.2009, at 02:36, David T. Lewis wrote:
Yes there are sure to be more unbalanced stack bugs, and yes somebody should probably check all the plugin code. Most likely nobody will get around to doing that but no worries, the stack VM seems to do an excellent job of finding these bugs :)
How costly would it be to always do this:
interpreterProxy pop: interpreterProxy methodArgumentCount + 1 thenPush: result
or
interpreterProxy pop: interpreterProxy methodArgumentCount
to return self?
The main concern seems to be existing bugs in the code base that are just now being exposed. I don't know of any easy way out; as far as I know you have to either inspect the code manually, or wait for it to fail at run time.
Note, it is common in many plugins to pop things off the stack early in the method, so you cannot just patch the last line of the method. You need to check senders to determine the expected argument count, then make sure that the plugin matches this.
SmartSyntaxInterpreterPlugin automates much of this, so it's less of a concern for plugins that use it. Also, I think that Eliot has some ideas for making the problem go away entirely.
Dave
David T. Lewis wrote:
The main concern seems to be existing bugs in the code base that are just now being exposed. I don't know of any easy way out; as far as I know you have to either inspect the code manually, or wait for it to fail at run time.
Note, it is common in many plugins to pop things off the stack early in the method, so you cannot just patch the last line of the method. You need to check senders to determine the expected argument count, then make sure that the plugin matches this.
SmartSyntaxInterpreterPlugin automates much of this, so it's less of a concern for plugins that use it. Also, I think that Eliot has some ideas for making the problem go away entirely.
We changed our VMs to ignore push/pop requests from plugins and rather have the VM do the management of arguments. The VM interprets popXYZ as an access of method argument n-x and pushXYZ as an implicit return from a method and then pops the "right" number of args regardless of what the plugin thinks it requested. We also added more convenient (left-to-right) argument accessors. This has worked very well for us.
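A minimal sketch of the idea described above, assuming a toy frame layout: "pop" only reads the next argument, "push" only records a result, and the caller removes receiver plus arguments itself. Every name here (ManagedFrame, managed_pop, managed_push, managed_arg, managed_call) is hypothetical and not taken from any actual VM source; answering the receiver when no result was pushed is also an assumption.

```c
#include <assert.h>

/* Toy frame: NOT real VM data structures, only an illustration. */
typedef struct {
    long slots[16];
    int depth;      /* live slots; receiver and args sit on top at call time */
    int argCount;   /* set by the "VM" before the primitive runs */
    int popsSeen;   /* how many pops the plugin has requested so far */
    int hasResult;  /* set once the plugin "pushes" something */
    long result;
} ManagedFrame;

/* Plugin-visible pop: answers the next value from the top down,
   but leaves the real stack untouched. */
static long managed_pop(ManagedFrame *f) {
    assert(f->popsSeen <= f->argCount);  /* the receiver is the last reachable slot */
    long v = f->slots[f->depth - 1 - f->popsSeen];
    f->popsSeen++;
    return v;
}

/* Plugin-visible push: treated as an implicit return value. */
static void managed_push(ManagedFrame *f, long v) {
    f->result = v;
    f->hasResult = 1;
}

/* Left-to-right accessor: index 0 is the first argument. */
static long managed_arg(const ManagedFrame *f, int index) {
    assert(index >= 0 && index < f->argCount);
    return f->slots[f->depth - f->argCount + index];
}

/* The "VM" side: run the primitive, then pop receiver + args itself,
   whatever the plugin believes it popped. */
static void managed_call(ManagedFrame *f, int argCount,
                         void (*prim)(ManagedFrame *)) {
    f->argCount = argCount;
    f->popsSeen = 0;
    f->hasResult = 0;
    prim(f);
    long receiver = f->slots[f->depth - argCount - 1];
    f->depth -= argCount + 1;
    f->slots[f->depth++] = f->hasResult ? f->result : receiver;
}

/* A sloppy primitive: it pops only one of its two arguments. */
static void prim_sloppy_add(ManagedFrame *f) {
    long b = managed_pop(f);      /* last argument */
    long a = managed_arg(f, 0);   /* first argument, read left to right */
    managed_push(f, a + b);
}
```

With a receiver and two arguments on the stack, managed_call leaves exactly one slot behind even though prim_sloppy_add requested a single pop.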
One thing we could do for the next generation of VMs, which have a new image format and won't be referred to as Squeak 3.x etc., is to up the interpreter proxy major number and use only that mechanism. Thoughts?
Cheers, - Andreas
On Wed, May 6, 2009 at 8:28 AM, Andreas Raab andreas.raab@gmx.de wrote:
David T. Lewis wrote:
The main concern seems to be existing bugs in the code base that are just now being exposed. I don't know of any easy way out; as far as I know you have to either inspect the code manually, or wait for it to fail at run time.
Note, it is common in many plugins to pop things off the stack early in the method, so you cannot just patch the last line of the method. You need to check senders to determine the expected argument count, then make sure that the plugin matches this.
SmartSyntaxInterpreterPlugin automates much of this, so it's less of a concern for plugins that use it. Also, I think that Eliot has some ideas for making the problem go away entirely.
We changed our VMs to ignore push/pop requests from plugins and rather have the VM do the management of arguments. The VM interprets popXYZ as an access of method argument n-x and pushXYZ as an implicit return from a method and then pops the "right" number of args regardless of what the plugin thinks it requested. We also added more convenient (left-to-right) argument accessors. This has worked very well for us.
Yes, we support this in the Stack VM. But this is sloooow, we're not actually using it yet, and I'm not supporting this for Cog. I'm relying merely on the simple stack balance check which has also worked and is much faster. This is only an error-checking step and doesn't have much benefit in production, whereas eliminating the bug in the first place by eliminating explicit stack manipulation in primitives is the way to go.
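A simple stack balance check of the kind mentioned above could look roughly like the following sketch. It assumes the convention that a successful primitive replaces the receiver and its arguments with one result, so the depth must drop by exactly the argument count. The names ToyStack, toy_push, toy_pop and run_checked are hypothetical and only illustrate the check; they are not the Stack VM's actual code.

```c
#include <assert.h>

/* Toy operand stack; hypothetical, not VM API. */
enum { TOY_STACK_MAX = 16 };

typedef struct {
    long slots[TOY_STACK_MAX];
    int depth;   /* number of live slots */
} ToyStack;

static void toy_push(ToyStack *s, long v) {
    assert(s->depth < TOY_STACK_MAX);
    s->slots[s->depth++] = v;
}

static void toy_pop(ToyStack *s, int n) {
    assert(n >= 0 && n <= s->depth);
    s->depth -= n;
}

typedef void (*ToyPrimitive)(ToyStack *s, int argCount);

/* Correct: pops receiver + args, then pushes one result. */
static void prim_balanced(ToyStack *s, int argCount) {
    long result = s->slots[s->depth - 1];   /* top of stack = last argument */
    toy_pop(s, argCount + 1);
    toy_push(s, result);
}

/* Buggy: forgets the receiver, leaving one slot too many. */
static void prim_unbalanced(ToyStack *s, int argCount) {
    long result = s->slots[s->depth - 1];
    toy_pop(s, argCount);
    toy_push(s, result);
}

/* Answers 1 if the primitive left the stack balanced, 0 otherwise. */
static int run_checked(ToyPrimitive prim, ToyStack *s, int argCount) {
    int expected = s->depth - argCount;  /* receiver + args become one result */
    prim(s, argCount);
    return s->depth == expected;
}
```

The check costs one subtraction and one comparison per call, which is why it is cheap compared to routing every push and pop through the VM.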
One thing we could do for the next generation of VMs, which have a new image format and won't be referred to as Squeak 3.x etc., is to up the interpreter proxy major number and use only that mechanism. Thoughts?
That a smart syntax approach is the way to go. That it may be much better to duplicate primitives than support varargs (very few primitives need varargs) but that varargs can be supported somehow, e.g.
    myVarArgClass: numArgs receiver: receiver with: firstArg
        <varargs>
        numArgs = 0 ifTrue: [^self fetchClassOf: receiver].
        numArgs = 1 ifTrue: [^self fetchClassOf: firstArg].
        ^self primitiveFailFor: BadNumArgs
which would be both the class primitive and a "thisContext mirrorClassOf: something" primitive as the current one is:
    primitiveClass
        | instance |
        instance := self stackTop.
        self pop: argumentCount+1 thenPush: (self fetchClassOf: instance)
That one makes the common case go fast as long as errors are apparent. So have as streamlined a mechanism that can still spot incorrect argument counts as possible, and hence that the best way is to make it impossible to get the argument count wrong in the first place by eliminating explicit stack manipulation in primitives.
Cheers,
- Andreas
2009/5/6 Eliot Miranda eliot.miranda@gmail.com:
On Wed, May 6, 2009 at 8:28 AM, Andreas Raab andreas.raab@gmx.de wrote:
David T. Lewis wrote:
The main concern seems to be existing bugs in the code base that are just now being exposed. I don't know of any easy way out; as far as I know you have to either inspect the code manually, or wait for it to fail at run time.
Note, it is common in many plugins to pop things off the stack early in the method, so you cannot just patch the last line of the method. You need to check senders to determine the expected argument count, then make sure that the plugin matches this.
SmartSyntaxInterpreterPlugin automates much of this, so it's less of a concern for plugins that use it. Also, I think that Eliot has some ideas for making the problem go away entirely.
We changed our VMs to ignore push/pop requests from plugins and rather have the VM do the management of arguments. The VM interprets popXYZ as an access of method argument n-x and pushXYZ as an implicit return from a method and then pops the "right" number of args regardless of what the plugin thinks it requested. We also added more convenient (left-to-right) argument accessors. This has worked very well for us.
Yes, we support this in the Stack VM. But this is sloooow, we're not actually using it yet, and I'm not supporting this for Cog. I'm relying merely on the simple stack balance check which has also worked and is much faster. This is only an error-checking step and doesn't have much benefit in production, whereas eliminating the bug in the first place by eliminating explicit stack manipulation in primitives is the way to go.
Except for primitives that accept any number of arguments, like BlockClosure>>value. But this is an exception rather than the common case.
One thing we could do for the next generation of VMs, which have a new image format and won't be referred to as Squeak 3.x etc., is to up the interpreter proxy major number and use only that mechanism. Thoughts?
That a smart syntax approach is the way to go. That it may be much better to duplicate primitives than support varargs (very few primitives need varargs) but that varargs can be supported somehow, e.g.
    myVarArgClass: numArgs receiver: receiver with: firstArg
        <varargs>
        numArgs = 0 ifTrue: [^self fetchClassOf: receiver].
        numArgs = 1 ifTrue: [^self fetchClassOf: firstArg].
        ^self primitiveFailFor: BadNumArgs
which would be both the class primitive and a "thisContext mirrorClassOf: something" primitive as the current one is:
    primitiveClass
        | instance |
        instance := self stackTop.
        self pop: argumentCount+1 thenPush: (self fetchClassOf: instance)
That one makes the common case go fast as long as errors are apparent. So have as streamlined a mechanism that can still spot incorrect argument counts as possible, and hence that the best way is to make it impossible to get the argument count wrong in the first place by eliminating explicit stack manipulation in primitives.
Cheers, - Andreas
On Wed, May 06, 2009 at 08:28:13AM -0700, Andreas Raab wrote:
One thing we could do for the next generation of VMs, which have a new image format and won't be referred to as Squeak 3.x etc., is to up the interpreter proxy major number and use only that mechanism. Thoughts?
I don't have a clear idea of what the roadmap to next generation VM's looks like. We have a lot of interesting projects and new ideas (Hydra, iPhone, Cog, Exupery, Croquet, ...).
I have a general sense that it is the responsibility of the image to identify its image format (hence what it requires of a VM) by means of an imageFormatVersionNumber known to the image and saved in the file header when saving the image. And it is the responsibility of a VM to know its capabilities in order to determine if it can interpret an image of some declared imageFormatVersionNumber.
My guess would be that yes, we will soon be due for a new interpreter proxy major number, but no that mechanism is not sufficient to identify the next generation VM(s). Specifically, knowing the major/minor interpreter proxy numbers will not tell you if a VM has the necessary capabilities to support the requirements of an image with a given imageFormatVersionNumber.
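The division of responsibility described here can be sketched as follows: the image declares one format number, and the VM compares it against the formats it knows how to interpret. The proxy major/minor numbers are carried along only to show that they play no part in the decision. ToyVM and vm_can_run_image are hypothetical names, and any concrete format numbers passed in are sample values chosen by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a VM's capabilities. */
typedef struct {
    const int *supportedFormats;  /* image format numbers this VM can interpret */
    size_t count;
    int proxyMajor;               /* plugin API version: unrelated to image format */
    int proxyMinor;
} ToyVM;

/* Answers 1 if the VM declares support for the image's format number.
   Note that proxyMajor/proxyMinor are deliberately not consulted. */
static int vm_can_run_image(const ToyVM *vm, int imageFormatVersionNumber) {
    assert(vm != NULL);
    for (size_t i = 0; i < vm->count; i++) {
        if (vm->supportedFormats[i] == imageFormatVersionNumber) {
            return 1;
        }
    }
    return 0;
}
```

Two VMs with identical proxy versions can give different answers here, which is the point being made about the proxy numbers not identifying a VM generation.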
Dave
2009/5/7 David T. Lewis lewis@mail.msen.com:
On Wed, May 06, 2009 at 08:28:13AM -0700, Andreas Raab wrote:
One thing we could do for the next generation of VMs, which have a new image format and won't be referred to as Squeak 3.x etc., is to up the interpreter proxy major number and use only that mechanism. Thoughts?
I don't have a clear idea of what the roadmap to next generation VM's looks like. We have a lot of interesting projects and new ideas (Hydra, iPhone, Cog, Exupery, Croquet, ...).
I have a general sense that it is the responsibility of the image to identify its image format (hence what it requires of a VM) by means of an imageFormatVersionNumber known to the image and saved in the file header when saving the image. And it is the responsibility of a VM to know its capabilities in order to determine if it can interpret an image of some declared imageFormatVersionNumber.
My guess would be that yes, we will soon be due for a new interpreter proxy major number, but no that mechanism is not sufficient to identify the next generation VM(s). Specifically, knowing the major/minor interpreter proxy numbers will not tell you if a VM has the necessary capabilities to support the requirements of an image with a given imageFormatVersionNumber.
Right, it doesn't. The interpreterProxy version numbers can only indicate which version of the external API the VM supports. But in Hydra, I eliminated that too. The interpreterProxy structure v.2.1 contains only 3 function pointers:
    sqInt (*minorVersion)(void);
    sqInt (*majorVersion)(void);
    /* IMPORTANT!!!
     * The rest of functions can be obtained by plugin by calling a
     * getVMFunctionPointerBySelector function.
     * The need in defining additional functions in this struct is gone forever
     */
    void * (*getVMFunctionPointerBySelector)(char * selector);
as you see, in case of Hydra, going past v.2.1. is quite pointless. :)
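For illustration only, a lookup-by-selector function might be backed by a small name table like the one below. This is an assumption about how such a mechanism could work, not Hydra's implementation. The sketch returns a generic function pointer type (VMFunction) rather than the void * of the struct above, purely to stay within portable C; toy_lookup and the toy_ version functions are hypothetical.

```c
#include <assert.h>
#include <string.h>

typedef long sqInt;               /* stand-in for the VM's integer type */
typedef void (*VMFunction)(void); /* generic function pointer, cast back by the caller */

static sqInt toy_minorVersion(void) { return 1; }
static sqInt toy_majorVersion(void) { return 2; }

/* Answers the function registered under the given selector, or NULL
   when the VM does not export it. A plugin must check for NULL and
   cast the result back to the function's real type before calling it. */
static VMFunction toy_lookup(const char *selector) {
    static const struct {
        const char *selector;
        VMFunction fn;
    } table[] = {
        { "minorVersion", (VMFunction)toy_minorVersion },
        { "majorVersion", (VMFunction)toy_majorVersion },
    };
    assert(selector != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].selector, selector) == 0) {
            return table[i].fn;
        }
    }
    return NULL;
}
```

A plugin would resolve each entry point once at load time and treat a NULL answer as a missing capability.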
Dave
Igor Stasenko wrote:
Interpreterproxy structure v.2.1 contains only 3 function pointers:
    sqInt (*minorVersion)(void);
    sqInt (*majorVersion)(void);
    /* IMPORTANT!!!
     * The rest of functions can be obtained by plugin by calling a
     * getVMFunctionPointerBySelector function.
     * The need in defining additional functions in this struct is gone forever
     */
    void * (*getVMFunctionPointerBySelector)(char * selector);
as you see, in case of Hydra, going past v.2.1. is quite pointless. :)
No, you're completely missing the point of the version identifier. It has two roles: One is to identify which set of functions a plugin can expect which allows us to provide compatibility functions since the proxy interface is documented. So one very good reason is documentation.
Second, there is a higher level notion of whether something is compatible or not - for example the return value from certain functions changes in a 64 bit image accordingly (I don't even know how a 32 bit plugin is prevented from interacting with a 64 bit image today). Sometimes you really need a high-level bit that tells you that the world has changed even if the names stay the same.
It makes absolutely no sense to say "oh, we'll all just look it up and then somehow it's going to magically work". Version information is *critical* if you want to play things together in the long term - it has allowed us to have a very smooth ride for a very long time in this area and I wish Squeak would have more of that in general.
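The role of the version identifier can be sketched as the check a plugin performs when it is handed the proxy: the major number must match exactly (the world has changed otherwise), and the minor number must be at least what the plugin was compiled against. The constants and names below (PLUGIN_PROXY_MAJOR, PLUGIN_PROXY_MINOR, ProxyVersion, plugin_accepts) are hypothetical and the numbers are sample values.

```c
#include <assert.h>

/* Versions the plugin was compiled against (sample values). */
#define PLUGIN_PROXY_MAJOR 1
#define PLUGIN_PROXY_MINOR 8

typedef struct {
    int major;
    int minor;
} ProxyVersion;

/* Answers 1 if the plugin may use the VM's proxy, 0 otherwise.
   Major mismatch: incompatible interface, refuse outright.
   Minor too small: the VM lacks functions the plugin expects. */
static int plugin_accepts(ProxyVersion vm) {
    assert(vm.major >= 0 && vm.minor >= 0);
    if (vm.major != PLUGIN_PROXY_MAJOR) {
        return 0;
    }
    return vm.minor >= PLUGIN_PROXY_MINOR;
}
```

A single comparison like this rejects a mismatched pairing before any function is called, which is something a per-name lookup cannot express on its own.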
Cheers, - Andreas
2009/5/7 Andreas Raab andreas.raab@gmx.de:
Igor Stasenko wrote:
Interpreterproxy structure v.2.1 contains only 3 function pointers:
    sqInt (*minorVersion)(void);
    sqInt (*majorVersion)(void);
    /* IMPORTANT!!!
     * The rest of functions can be obtained by plugin by calling a
     * getVMFunctionPointerBySelector function.
     * The need in defining additional functions in this struct is gone forever
     */
    void * (*getVMFunctionPointerBySelector)(char * selector);
as you see, in case of Hydra, going past v.2.1. is quite pointless. :)
No, you're completely missing the point of the version identifier. It has two roles: One is to identify which set of functions a plugin can expect which allows us to provide compatibility functions since the proxy interface is documented. So one very good reason is documentation.
Second, there is a higher level notion of whether something is compatible or not - for example the return value from certain functions changes in a 64 bit image accordingly (I don't even know how a 32 bit plugin is prevented from interacting with a 64 bit image today). Sometimes you really need a high-level bit that tells you that the world has changed even if the names stay the same.
This can be solved simply: add a function returning a 32-bit value that answers whether the VM is 32-bit or 64-bit.
It makes absolutely no sense to say "oh, we'll all just look it up and then somehow it's going to magically work". Version information is *critical* if you want to play things together in the long term - it has allowed us to have a very smooth ride for a very long time in this area and I wish Squeak would have more of that in general.
Taking a function pointer by name, is nothing more than enumerating the VM capabilities. Take a look at OpenGL extension mechanism. Do they have to change the version of OpenGL each time they want to add new functionality? No. You can simply ask the library about support of certain capability - and depending on answer decide what to do.
Cheers, - Andreas
Igor Stasenko wrote:
It makes absolutely no sense to say "oh, we'll all just look it up and then somehow it's going to magically work". Version information is *critical* if you want to play things together in the long term - it has allowed us to have a very smooth ride for a very long time in this area and I wish Squeak would have more of that in general.
Taking a function pointer by name, is nothing more than enumerating the VM capabilities. Take a look at OpenGL extension mechanism. Do they have to change the version of OpenGL each time they want to add new functionality? No. You can simply ask the library about support of certain capability - and depending on answer decide what to do.
Yes, OpenGL is a great example. Because what the OpenGL consortium does is moving extensions into core functionality, increasing the version number of OpenGL so that clients know they can rely on a documented and stable API. Exactly my point.
I have no problems with a named lookup mechanism in addition to a core interface. In fact we have one, it's ioLoadFunctionFrom. It solves a useful problem, namely that of how to support entry points where you don't know whether they will be available in the future or if it's only for use in this one version. But it's not a replacement for a documented core API.
Cheers, - Andreas
On 07.05.2009, at 17:38, Andreas Raab wrote:
Igor Stasenko wrote:
It makes absolutely no sense to say "oh, we'll all just look it up and then somehow it's going to magically work". Version information is *critical* if you want to play things together in the long term - it has allowed us to have a very smooth ride for a very long time in this area and I wish Squeak would have more of that in general.
Taking a function pointer by name, is nothing more than enumerating the VM capabilities. Take a look at OpenGL extension mechanism. Do they have to change the version of OpenGL each time they want to add new functionality? No. You can simply ask the library about support of certain capability - and depending on answer decide what to do.
Yes, OpenGL is a great example. Because what the OpenGL consortium does is moving extensions into core functionality, increasing the version number of OpenGL so that clients know they can rely on a documented and stable API. Exactly my point.
I have no problems with a named lookup mechanism in addition to a core interface. In fact we have one, it's ioLoadFunctionFrom. It solves a useful problem, namely that of how to support entry points where you don't know whether they will be available in the future or if it's only for use in this one version. But it's not a replacement for a documented core API.
+1
But back to the original question - do we want to change the plugin API for 4.0 VMs?
- Bert -
On Thu, May 7, 2009 at 8:44 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 07.05.2009, at 17:38, Andreas Raab wrote:
Igor Stasenko wrote:
It makes absolutely no sense to say "oh, we'll all just look it up and
then somehow it's going to magically work". Version information is *critical* if you want to play things together in the long term - it has allowed us to have a very smooth ride for a very long time in this area and I wish Squeak would have more of that in general.
Taking a function pointer by name, is nothing more than enumerating
the VM capabilities. Take a look at OpenGL extension mechanism. Do they have to change the version of OpenGL each time they want to add new functionality? No. You can simply ask the library about support of certain capability - and depending on answer decide what to do.
Yes, OpenGL is a great example. Because what the OpenGL consortium does is moving extensions into core functionality, increasing the version number of OpenGL so that clients know they can rely on a documented and stable API. Exactly my point.
I have no problems with a named lookup mechanism in addition to a core interface. In fact we have one, it's ioLoadFunctionFrom. It solves a useful problem, namely that of how to support entry points where you don't know whether they will be available in the future or if it's only for use in this one version. But it's not a replacement for a documented core API.
+1
But back to the original question - do we want to change the plugin API for 4.0 VMs?
Yes. I propose the following changes:
- change the interface to the API to one based on function pointers, not interpreterProxy. The function pointers are initialized in a plugin's setInterpreter: which does the traditional compatibility check and then fetches the function pointers from the argument (exactly how we can discuss further).
- remove stack access from the API, writing them as SmartSyntaxPlugins where arguments are passed in as parameters, returning the result on success and 0 (not SmallInteger 0) on failure
- eliminate pushRemappableOop/popRemappableOop and make the allocation interface one that can fail, so when memory runs out a primitive will return an out-of-memory error code
- use primitive error codes where appropriate. e.g. in a plugin
    ^self primitiveFailFor: PrimErrNoMem
  is equivalent to
    primErrorCode = 0 ifTrue: [primErrCode := PrimErrNoMem]. ^nil
- provide isImmediateObject: and use it in place of isIntegerObject: when the objective is to select heap objects. Use isCharacterObject: when the objective is to select a character. I intend to add immediate characters within the next few months.
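Two of the proposals above, an allocation interface that can fail and primitives that answer 0 plus an error code on failure, might combine as in this sketch. It assumes "first failure wins" semantics for the error code, as the primitiveFailFor: equivalence suggests. ToyInterp, prim_fail_for, toy_allocate and prim_make_buffer are hypothetical names, the budget parameter merely simulates running out of memory, and the numeric error values are made up.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical error codes; 0 means "no failure recorded". */
enum { PrimNoErr = 0, PrimErrBadArgument = 3, PrimErrNoMem = 9 };

typedef struct {
    int primErrorCode;
} ToyInterp;

/* Records the code only if no earlier failure was recorded, then
   answers NULL so a primitive can write `return prim_fail_for(...)`. */
static void *prim_fail_for(ToyInterp *vm, int code) {
    assert(code != PrimNoErr);
    if (vm->primErrorCode == PrimNoErr) {
        vm->primErrorCode = code;
    }
    return NULL;
}

/* Allocation that is allowed to fail: requests above the budget answer NULL. */
static void *toy_allocate(size_t bytes, size_t budget) {
    return bytes <= budget ? malloc(bytes) : NULL;
}

/* A primitive in the proposed style: arguments arrive as parameters,
   the result is returned on success, NULL (0) on failure. */
static void *prim_make_buffer(ToyInterp *vm, size_t bytes, size_t budget) {
    void *obj = toy_allocate(bytes, budget);
    if (obj == NULL) {
        return prim_fail_for(vm, PrimErrNoMem);
    }
    return obj;
}
```

Because nothing is pushed or popped inside the primitive, there is no stack depth for it to get wrong; the caller decides what to do with the result or the error code.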
- Bert -
2009/5/7 Eliot Miranda eliot.miranda@gmail.com:
On Thu, May 7, 2009 at 8:44 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 07.05.2009, at 17:38, Andreas Raab wrote:
Igor Stasenko wrote:
It makes absolutely no sense to say "oh, we'll all just look it up and then somehow it's going to magically work". Version information is *critical* if you want to play things together in the long term - it has allowed us to have a very smooth ride for a very long time in this area and I wish Squeak would have more of that in general.
Taking a function pointer by name, is nothing more than enumerating the VM capabilities. Take a look at OpenGL extension mechanism. Do they have to change the version of OpenGL each time they want to add new functionality? No. You can simply ask the library about support of certain capability - and depending on answer decide what to do.
Yes, OpenGL is a great example. Because what the OpenGL consortium does is moving extensions into core functionality, increasing the version number of OpenGL so that clients know they can rely on a documented and stable API. Exactly my point.
I have no problems with a named lookup mechanism in addition to a core interface. In fact we have one, it's ioLoadFunctionFrom. It solves a useful problem, namely that of how to support entry points where you don't know whether they will be available in the future or if it's only for use in this one version. But it's not a replacement for a documented core API.
+1
But back to the original question - do we want to change the plugin API for 4.0 VMs?
Yes. I propose the following changes:
- change the interface to the API to one based on function pointers, not interpreterProxy. The function pointers are initialized in a plugin's setInterpreter: which does the traditional compatibility check and then fetches the function pointers from the argument (exactly how we can discuss further).
- remove stack access from the API, writing them as SmartSyntaxPlugins where arguments are passed in as parameters, returning the result on success and 0 (not SmallInteger 0) on failure
- eliminate pushRemappableOop/popRemappableOop and make the allocation interface one that can fail, so when memory runs out a primitive will return an out-of-memory error code
- use primitive error codes where appropriate. e.g. in a plugin
    ^self primitiveFailFor: PrimErrNoMem
  is equivalent to
    primErrorCode = 0 ifTrue: [primErrCode := PrimErrNoMem]. ^nil
- provide isImmediateObject: and use it in place of isIntegerObject: when the objective is to select heap objects. Use isCharacterObject: when the objective is to select a character. I intend to add immediate characters within the next few months.
So, does that mean that you will extend the oop tag to 2 bits (or more)? Or just reserve a non-movable heap space for character objects, like:
    isCharacterObject: oop
        ^ oop >= charsStart and: [ oop < charsEnd ]
- Bert -
On Thu, May 7, 2009 at 9:36 AM, Igor Stasenko siguctua@gmail.com wrote:
2009/5/7 Eliot Miranda eliot.miranda@gmail.com:
On Thu, May 7, 2009 at 8:44 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 07.05.2009, at 17:38, Andreas Raab wrote:
Igor Stasenko wrote:
It makes absolutely no sense to say "oh, we'll all just look it up and then somehow it's going to magically work". Version information is *critical* if you want to play things together in the long term - it has allowed us to have a very smooth ride for a very long time in this area and I wish Squeak would have more of that in general.
Taking a function pointer by name, is nothing more than enumerating the VM capabilities. Take a look at OpenGL extension mechanism. Do they have to change the version of OpenGL each time they want to add new functionality? No. You can simply ask the library about support of certain capability - and depending on answer decide what to do.
Yes, OpenGL is a great example. Because what the OpenGL consortium does
is moving extensions into core functionality, increasing the version number of OpenGL so that clients know they can rely on a documented and stable API. Exactly my point.
I have no problems with a named lookup mechanism in addition to a core
interface. In fact we have one, it's ioLoadFunctionFrom. It solves a useful problem, namely that of how to support entry points where you don't know whether they will be available in the future or if it's only for use in this one version. But it's not a replacement for a documented core API.
+1
But back to the original question - do we want to change the plugin API
for 4.0 VMs?
Yes. I propose the following changes:
- change the interface to the API to one based on function pointers, not
interpreterProxy. The function pointers are initialized in a plugin's setInterpreter: which does the traditional compatibility check and then fetches the function pointers from the argument (exactly how we can discuss further).
- remove stack access from the API, writing them as SmartSyntaxPlugins
where arguments are passed in as parameters, returning the result on success and 0 (not SmallInteger 0) on failure
- eliminate pushRemappableOop/popRemappableOop and make the allocation
interface one that can fail, so when memory runs out a primitive will return an out-of-memory error code
- use primitive error codes where appropriate. e.g. in a plugin
    ^self primitiveFailFor: PrimErrNoMem
  is equivalent to
    primErrorCode = 0 ifTrue: [primErrCode := PrimErrNoMem]. ^nil
- provide isImmediateObject: and use it in place of isIntegerObject: when
the objective is to select heap objects. Use isCharacterObject: when the objective is to select a character. I intend to add immediate characters within the next few months.
So, does that mean that you will extend the oop tag to 2 bits (or more)?
Yes. Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate characters. Andreas wrote a thorough sketch of this scheme (http://lists.squeakfoundation.org/pipermail/vm-dev/2006-January/000429.html) in 2006.
Or just reserve a non-movable heap space for character objects, like:
    isCharacterObject: oop
        ^ oop >= charsStart and: [ oop < charsEnd ]
No. This doesn't scale to unicode. The tagged approach provides much faster string access, and identity comparison for all characters, not just the byte range.
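One possible tag layout matching the description (31-bit SmallIntegers, 24-bit immediate characters, 2 tag bits) is sketched below. The specific bit assignment is an assumption made for illustration: low bit 1 for SmallIntegers, low two bits 10 for characters, 00 for heap pointers. The function names are hypothetical, apart from the isImmediateObject:/isCharacterObject: ideas they mirror.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t oop;   /* a 32-bit object pointer or tagged immediate */

/* SmallInteger: 31-bit payload, low bit set. */
static oop int_to_oop(int32_t v) { return ((uint32_t)v << 1) | 1u; }
static int is_int_oop(oop o)     { return (o & 1u) == 1u; }

/* Decode with explicit 31-bit sign extension, avoiding a signed right shift. */
static int32_t oop_to_int(oop o) {
    assert(is_int_oop(o));
    return (int32_t)((o >> 1) ^ 0x40000000u) - 0x40000000;
}

/* Character: 24-bit code in the upper bits, low two bits are 10. */
static oop char_to_oop(uint32_t code) {
    assert(code < (1u << 24));
    return (code << 2) | 2u;
}
static int is_char_oop(oop o)       { return (o & 3u) == 2u; }
static uint32_t oop_to_char(oop o)  { assert(is_char_oop(o)); return o >> 2; }

/* Anything with a non-zero tag is immediate; heap pointers end in 00. */
static int is_immediate_oop(oop o)  { return (o & 3u) != 0u; }
```

Under this layout two characters with the same code are the same bit pattern, so identity comparison works for every code in range, and extracting the code is a shift rather than a memory access.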
- Bert -
-- Best regards, Igor Stasenko AKA sig.
On 07.05.2009, at 18:53, Eliot Miranda wrote:
On Thu, May 7, 2009 at 9:36 AM, Igor Stasenko siguctua@gmail.com wrote:
2009/5/7 Eliot Miranda eliot.miranda@gmail.com:
I intend to add immediate characters within the next few months.
So, does that mean that you will extend the oop tag to 2 bits (or more)?
Yes. Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate characters. Andreas wrote a thorough sketch of this scheme in 2006.
Or just reserve a non-movable heap space for character objects, like:
    isCharacterObject: oop
        ^ oop >= charsStart and: [ oop < charsEnd ]
No. This doesn't scale to unicode. The tagged approach provides much faster string access, and identity comparison for all characters, not just the byte range.
Do we have evidence that Character allocation is an actual performance bottleneck?
- Bert -
On Thu, May 7, 2009 at 10:11 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 07.05.2009, at 18:53, Eliot Miranda wrote:
On Thu, May 7, 2009 at 9:36 AM, Igor Stasenko siguctua@gmail.com wrote:
2009/5/7 Eliot Miranda eliot.miranda@gmail.com:
I intend to add immediate characters within the next few months.
So, does that mean that you will extend the oop tag to 2 bits (or more)?
Yes. Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate characters. Andreas wrote a thorough sketch of this scheme (http://lists.squeakfoundation.org/pipermail/vm-dev/2006-January/000429.html) in 2006.
Or just reserve a non-movable heap space for character objects, like:
    isCharacterObject: oop
        ^ oop >= charsStart and: [ oop < charsEnd ]
No. This doesn't scale to unicode. The tagged approach provides much faster string access, and identity comparison for all characters, not just the byte range.
Do we have evidence that Character allocation is an actual performance bottleneck?
The problem is not so much character allocation because by far the most character access in e.g. IDE usage is with byte characters. The problem is character indirection. To assign a character to a string in ByteString>>at:put: requires indirecting through the character to extract the character code. To answer a character in ByteString>>at: involves indirecting through the specialObjectsOop to fetch the characterTable and indexing the character table with the byte character code. But allocation is very slow in Squeak so for Unicode the problem for at: is much worse because we also have to allocate the result box.
This is very slow compared to merely adding/removing a tag bit. As for evidence as to whether this is a bottle-neck it is obscured by plugin primitives that mitigate the effects, such as primitiveFindSubstring. In general these are a bad idea because they're hard to debug, effectively impossible to change and are not polymorphic. But take a look at the following, which uses identityIndexOf: to avoid primitive machinery (but is no slower since invoking the primitive machinery is in itself expensive)
    | wa ws ba bs bc wc n |
    ba := (120 to: 125) collect: [:cc| Character value: cc].
    bs := ba asString.
    bc := ba last.
    wa := (12345 to: 12350) collect: [:cc| Character value: cc].
    ws := wa asString.
    wc := wa last.
    n := 1000000.
    { Time millisecondsToRun: [1 to: n do: [:ign| ba identityIndexOf: bc ifAbsent: 0]].
      Time millisecondsToRun: [1 to: n do: [:ign| bs identityIndexOf: bc ifAbsent: 0]].
      Time millisecondsToRun: [1 to: n do: [:ign| wa identityIndexOf: wc ifAbsent: 0]].
      Time millisecondsToRun: [1 to: n do: [:ign| ws identityIndexOf: wc ifAbsent: 0]] }

    Squeak 4.0 beta1 Closure: #(448 1058 451 8631)
    Stack VM:                 #(581 1531 579 7303)
    Cog:                      #(214 600 213 1970)
So string access is two to three times slower than array access when the result is fetched from the character table and an order of magnitude worse when the result must be boxed. (I don't know why the Stack VM is slower than Squeak 4 for non-boxed access; I probably need to do a merge :) ).
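To make the "merely adding/removing a tag bit" point concrete, here is a minimal C sketch of a 2-bit tagging scheme of the kind discussed above. The tag values, the 32-bit oop type, and the conversion helper names are assumptions chosen for illustration; this is not the layout of any actual Squeak VM, only one way such a scheme could look.

```c
#include <stdint.h>

typedef uint32_t oop;   /* assumed 32-bit object pointer */

/* Assumed tag layout (illustrative only):
 *   .......1  SmallInteger, 31-bit payload
 *   ......10  immediate Character, code point in the upper bits
 *   ......00  pointer to a heap object
 */
int isIntegerObject(oop o)   { return (o & 1u) == 1u; }
int isCharacterObject(oop o) { return (o & 3u) == 2u; }
int isImmediateObject(oop o) { return (o & 3u) != 0u; }

/* Boxing and unboxing a character is a shift and an OR: no table lookup,
 * no allocation, and identity comparison works for every code point. */
oop      characterObjectOf(uint32_t codePoint) { return (codePoint << 2) | 2u; }
uint32_t characterValueOf(oop o)               { return o >> 2; }

/* SmallIntegers keep their 31-bit payload. The signed right shift relies on
 * arithmetic shifting, which common compilers provide. */
oop     integerObjectOf(int32_t value) { return ((uint32_t)value << 1) | 1u; }
int32_t integerValueOf(oop o)          { return (int32_t)o >> 1; }
```

With this layout, code that wants heap objects tests isImmediateObject: rather than isIntegerObject:, since a character is immediate without being an integer.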
- Bert -
Bert Freudenberg wrote:
Yes. Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate characters. Andreas wrote a thorough sketch of this scheme in 2006.
I would like to see this scheme get adopted. My own suggestion for the spare encoding (spare if the GC is changed, that is) was for unboxed 30 bit floats, but that seems to be unpopular and I guess I can live with float arrays instead.
Certainly immediate characters are very important as we move away from ASCII, and I liked Andreas' suggestions of colors and short points as well. But I would be particularly interested in having an immediate encoding for symbols. You already have a global table for converting to/from strings, so I see no need to store anything in the symbols themselves. Having a trivial way to know that an object is a symbol without chasing a class pointer could make serializing/restoring objects a little faster.
-- Jecel
On 08.05.2009, at 01:13, Jecel Assumpcao Jr wrote:
Bert Freudenberg wrote:
Yes. Keep 31-bit SmallIntegers, provide e.g. 24-bit immediate characters. Andreas wrote a thorough sketch of this scheme in 2006.
I would like to see this scheme get adopted. My own suggestion for the spare encoding (spare if the GC is changed, that is) was for unboxed 30 bit floats, but that seems to be unpopular and I guess I can live with float arrays instead.
Certainly immediate characters are very important as we move away from ASCII, and I liked Andreas' suggestions of colors and short points as well. But I would be particularly interested in having an immediate encoding for symbols. You already have a global table for converting to/from strings, so I see no need to store anything in the symbols themselves. Having a trivial way to know that an object is a symbol without chasing a class pointer could make serializing/restoring objects a little faster.
-- Jecel
The quote above was Eliot's, not mine.
Having immediate Floats seems rather compelling to me, too. Or immediate doubles, which might be a good reason to want a 64-bit image.
- Bert -
At Thu, 7 May 2009 09:04:54 -0700, Eliot Miranda wrote:
- remove stack access from the API, writing them as SmartSyntaxPlugins where arguments are passed in as parameters,
returning the result on success and 0 (not SmallInteger 0) on failure
In these days, nobody would care much about it, but this would make it harder to simulate a platform independent performance primitive in the image?
- provide isImmediateObject: and use it in place of isIntegerObject: when the objective is to select heap objects. Use
isCharacterObject: when the objective is to select a character. I intend to add immediate characters within the next few months.
Are you going to use UTF-32 or UTF-16 for it?
-- Yoshiki
On Thu, May 7, 2009 at 10:45 AM, Yoshiki Ohshima yoshiki@vpri.org wrote:
At Thu, 7 May 2009 09:04:54 -0700, Eliot Miranda wrote:
- remove stack access from the API, writing them as SmartSyntaxPlugins
where arguments are passed in as parameters,
returning the result on success and 0 (not SmallInteger 0) on failure
In these days, nobody would care much about it, but this would make it harder to simulate a platform independent performance primitive in the image?
I don't think it makes any difference. In the simulator the VM could e.g. use perform:withArguments: to invoke the primitive. The real VM needs to do something similar and have glue to the platform's native calling convention, which can be as simple as a 32-element switch statement:

    switch (numArgs) {
    case 0: result = primitiveFunctionPointer(stackTop()); break;
    case 1: result = primitiveFunctionPointer(stackValue(1), stackTop()); break;
    ...

or as sophisticated as machine code generated on the fly.
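A slightly fuller sketch of that switch-based glue, under stated assumptions: the toy stack, the Prim0/Prim1/Prim2 typedefs, callPrimitive and primAdd are all hypothetical names invented here, and the values are plain integers rather than tagged oops. In a real VM every valid result is an oop, so an all-zero word can serve as the failure value.

```c
typedef long sqInt;                       /* stand-in for the VM's sqInt */

/* Toy operand stack; stackValue(0) is the top of stack. */
static sqInt stack[16];
static int   sp = -1;

void  push(sqInt v)       { stack[++sp] = v; }
sqInt stackValue(int off) { return stack[sp - off]; }
sqInt stackTop(void)      { return stack[sp]; }

/* One function type per arity; the receiver is always the first parameter. */
typedef sqInt (*Prim0)(sqInt rcvr);
typedef sqInt (*Prim1)(sqInt rcvr, sqInt arg1);
typedef sqInt (*Prim2)(sqInt rcvr, sqInt arg1, sqInt arg2);

/* Fetch receiver and arguments from the stack and pass them as C parameters.
 * Answers the primitive's result, or 0 when the arity is not handled here. */
sqInt callPrimitive(void (*fn)(void), int numArgs)
{
    switch (numArgs) {
    case 0:  return ((Prim0)fn)(stackTop());
    case 1:  return ((Prim1)fn)(stackValue(1), stackTop());
    case 2:  return ((Prim2)fn)(stackValue(2), stackValue(1), stackTop());
    default: return 0;                    /* a full version would continue to 31 */
    }
}

/* Example primitive written without any stack access of its own. */
sqInt primAdd(sqInt rcvr, sqInt arg) { return rcvr + arg; }
```

The primitive itself never touches the stack; only the glue does, which is what makes stack imbalance inside a plugin impossible under this scheme.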
- provide isImmediateObject: and use it in place of isIntegerObject: when
the objective is to select heap objects. Use
isCharacterObject: when the objective is to select a character. I intend to add immediate characters within the next few months.
Are you going to use UTF-32 or UTF-16 for it?
Characters would be Unicode code points (WideString is UTF-32 right?). UTF-16 is a variable-length string encoding. Presumably there will be primitive converters to/from UTF-16 to WideString.
-- Yoshiki
At Thu, 7 May 2009 11:09:32 -0700, Eliot Miranda wrote:
- remove stack access from the API, writing them as SmartSyntaxPlugins where arguments are passed in as parameters, returning the result on success and 0 (not SmallInteger 0) on failure
In these days, nobody would care much about it, but this would make it harder to simulate a platform independent performance primitive in the image?
I don't think it makes any difference. In the simulator the VM could e.g. use perform:withArguments: to invoke the primitive. The real VM needs to do something similar and have glue to the platform's native calling convention, which can be as simple as a 32-element switch statement:

    switch (numArgs) {
    case 0: result = primitiveFunctionPointer(stackTop()); break;
    case 1: result = primitiveFunctionPointer(stackValue(1), stackTop()); break;
    ...

or as sophisticated as machine code generated on the fly.
What I mean was to debug the Slang-ish code in the Smalltalk Debugger. Putting "halt" in the primitive code in Slang and doing #doPrimitive: lets you do it, but code written in SmartSyntaxInterpreter syntax doesn't do what it says so Smalltalk debugger cannot handle it. But again, this is a minor issue now.
- provide isImmediateObject: and use it in place of isIntegerObject: when the objective is to select heap objects. Use isCharacterObject: when the objective is to select a character. I intend to add immediate characters within the next few months.
Are you going to use UTF-32 or UTF-16 for it?
Characters would be Unicode code points (WideString is UTF-32 right?). UTF-16 is a variable-length string encoding. Presumably there will be primitive converters to/from UTF-16 to WideString.
Yes, among these choices, my vote would be for UTF-32 (for 21-bit space). But variable-length-ness doesn't really go away even when using UTF-32, as there are composition characters.
Alternatively, we could go for all UTF-8 in image representation for Strings (as a data buffer) and when you need a Character, create an instance, or return the one in a table, that is in UTF-32. And in the image side, displayable "String" should (almost) always accompany the attributes like Text.
-- Yoshiki
On Thu, May 7, 2009 at 11:29 AM, Yoshiki Ohshima yoshiki@vpri.org wrote:
At Thu, 7 May 2009 11:09:32 -0700, Eliot Miranda wrote:
- remove stack access from the API, writing them as SmartSyntaxPlugins where arguments are passed in as parameters, returning the result on success and 0 (not SmallInteger 0) on failure
In these days, nobody would care much about it, but this would make it harder to simulate a platform independent performance primitive in the image?
I don't think it makes any difference. In the simulator the VM could e.g. use perform:withArguments: to invoke the primitive. The real VM needs to do something similar and have glue to the platform's native calling convention, which can be as simple as a 32-element switch statement:

    switch (numArgs) {
    case 0: result = primitiveFunctionPointer(stackTop()); break;
    case 1: result = primitiveFunctionPointer(stackValue(1), stackTop()); break;
    ...

or as sophisticated as machine code generated on the fly.
What I mean was to debug the Slang-ish code in the Smalltalk Debugger. Putting "halt" in the primitive code in Slang and doing #doPrimitive: lets you do it, but code written in SmartSyntaxInterpreter syntax doesn't do what it says so Smalltalk debugger cannot handle it. But again, this is a minor issue now.
Ah, OK, now I get it. I think we can fix this. If the type information is moved into pragmas then I think the debug issue can be made to go away. The simulator would have to read the pragma and type-convert before it called perform:, but I think this is straightforward. The pragma could be e.g. performable by the VM to do the type conversion.
- provide isImmediateObject: and use it in place of isIntegerObject: when the objective is to select heap objects. Use isCharacterObject: when the objective is to select a character. I intend to add immediate characters within the next few months.
Are you going to use UTF-32 or UTF-16 for it?
Characters would be Unicode code points (WideString is UTF-32 right?). UTF-16 is a variable-length string encoding. Presumably there will be primitive converters to/from UTF-16 to WideString.
Yes, among these choices, my vote would be for UTF-32 (for 21-bit space). But variable-length-ness doesn't really go away even when using UTF-32, as there are composition characters.
Alternatively, we could go for all UTF-8 in image representation for Strings (as a data buffer) and when you need a Character, create an instance, or return the one in a table, that is in UTF-32. And in the image side, displayable "String" should (almost) always accompany the attributes like Text.
I'm a bit out of my depth here. I would have thought that you would want the basic string types to be fixed width for fast accessing, simply because variable length doesn't scale to e.g. indexing 1 megabyte strings. But that for the platform interface one would want efficient conversion to/from fixed and variable length encodings. But that's just my gut. I expect I'll implement whatever y'all say makes sense.
-- Yoshiki
At Thu, 7 May 2009 11:37:10 -0700, Eliot Miranda wrote:
Yes, among these choices, my vote would be for UTF-32 (for 21-bit space). But variable-length-ness doesn't really go away even when using UTF-32, as there are composition characters.
Alternatively, we could go for all UTF-8 in image representation for Strings (as a data buffer) and when you need a Character, create an instance, or return the one in a table, that is in UTF-32. And in the image side, displayable "String" should (almost) always accompany the attributes like Text.
I'm a bit out of my depth here. I would have thought that you would want the basic string types to be fixed width for fast accessing, simply because variable length doesn't scale to e.g. indexing 1 megabyte strings. But that for the platform interface one would want efficient conversion to/from fixed and variable length encodings. But that's just my gut. I expect I'll implement whatever y'all say makes sense.
Basically, I think UTF-32 is ok for the time being and requires very little change to the code.
With the presence of composition characters, the situation where you randomly access an element and expect it to be a meaningful value by itself is rarer.
My proposition is that for a String (as data), we would rather avoid random access anyway and always access it via a Stream. Then, the actual representation can be different.
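As a rough illustration of stream-style access over a UTF-8 buffer that answers UTF-32 code points, here is a small C sketch. Utf8ReadStream, utf8Next and utf8AtEnd are hypothetical names invented for this example, not existing VM or image API, and for brevity the decoder does not reject overlong forms or surrogate code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read stream over a UTF-8 data buffer. */
typedef struct {
    const unsigned char *bytes;
    size_t size;
    size_t pos;
} Utf8ReadStream;

int utf8AtEnd(const Utf8ReadStream *s) { return s->pos >= s->size; }

/* Answer the next code point (a UTF-32 value), or -1 at end of data or on a
 * malformed or truncated sequence. The position advances only on success. */
int32_t utf8Next(Utf8ReadStream *s)
{
    unsigned char lead;
    uint32_t cp;
    int extra, i;

    if (s->pos >= s->size) return -1;
    lead = s->bytes[s->pos];
    if (lead < 0x80)                { extra = 0; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
    else return -1;                               /* stray continuation byte */

    if (s->size - s->pos <= (size_t)extra) return -1;   /* truncated */
    for (i = 1; i <= extra; i++) {
        unsigned char c = s->bytes[s->pos + i];
        if ((c & 0xC0) != 0x80) return -1;        /* not a continuation byte */
        cp = (cp << 6) | (c & 0x3Fu);
    }
    s->pos += (size_t)extra + 1;
    return (int32_t)cp;
}
```

Sequential reads like this cost the same per character regardless of string length; what is lost is constant-time indexing, which is the trade-off Eliot raises above.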
-- Yoshiki
On Thu, May 07, 2009 at 05:06:48PM +0300, Igor Stasenko wrote:
2009/5/7 Andreas Raab andreas.raab@gmx.de:
Second, there is a higher level notion of whether something is compatible or not - for example the return value from certain functions change in a 64 bit image accordingly (I don't even know how a 32 bit plugin is prevented from interacting with a 64 bit image today). Sometimes you really need a high-level bit that tells you that the world has changed even if the names stay the same.
This can be solved simply: add a function with a 32-bit return value that answers whether the VM is 32-bit or 64-bit.
It's already there. The expression "self bytesPerWord" translates to either 4 or 8 for 32-bit and 64-bit images respectively.
The implementation is in CCodeGenerator>>generateBytesPerWord:on:indent: in VMMaker since VMMaker-dtl.90 on SqS. The change set is on Mantis 7182 if you need to load it into a different VMMaker.
Dave
On Thu, May 07, 2009 at 11:34:46PM -0400, David T. Lewis wrote:
On Thu, May 07, 2009 at 05:06:48PM +0300, Igor Stasenko wrote:
2009/5/7 Andreas Raab andreas.raab@gmx.de:
Second, there is a higher level notion of whether something is compatible or not - for example the return value from certain functions change in a 64 bit image accordingly (I don't even know how a 32 bit plugin is prevented from interacting with a 64 bit image today). Sometimes you really need a high-level bit that tells you that the world has changed even if the names stay the same.
This can be solved simply: add a function with a 32-bit return value that answers whether the VM is 32-bit or 64-bit.
It's already there. The expression "self bytesPerWord" translates to either 4 or 8 for 32-bit and 64-bit images respectively.
The implementation is in CCodeGenerator>>generateBytesPerWord:on:indent: in VMMaker since VMMaker-dtl.90 on SqS. The change set is on Mantis 7182 if you need to load it into a different VMMaker.
Apologies, I just realized that I missed your point entirely. For a plugin to check if its compile-time view of "self bytesPerWord" is the same as the bytes per word of the VM that loaded the plugin would presumably require an entry in the interpreter proxy, which does not currently exist. Sorry for the noise.
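For illustration only, this is roughly what such a check could look like in C if the proxy did offer the entry. HypotheticalProxy and its bytesPerWord member are invented for this sketch; as noted above, no such entry exists in the interpreter proxy today.

```c
#include <stddef.h>

/* HYPOTHETICAL proxy fragment: the real interpreter proxy has no such entry. */
typedef struct {
    int (*bytesPerWord)(void);   /* would answer 4 or 8 for the running VM */
} HypotheticalProxy;

/* Answer 1 if a plugin generated for pluginBytesPerWord may be used with the
 * VM behind proxy, 0 otherwise. A VM that lacks the entry is treated as
 * incompatible here, since the plugin has no way to tell. */
int pluginIsCompatible(const HypotheticalProxy *proxy, int pluginBytesPerWord)
{
    if (proxy == NULL || proxy->bytesPerWord == NULL) return 0;
    return proxy->bytesPerWord() == pluginBytesPerWord;
}

/* Stand-ins for a 32-bit and a 64-bit VM. */
int vm32BytesPerWord(void) { return 4; }
int vm64BytesPerWord(void) { return 8; }
```

A plugin initializer could call such a check once at load time and refuse to load on a mismatch, instead of crashing on the first primitive that touches the stack.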
Dave
On Wed, May 06, 2009 at 10:15:11PM -0700, Andreas Raab wrote:
Second, there is a higher level notion of whether something is compatible or not - for example the return value from certain functions change in a 64 bit image accordingly (I don't even know how a 32 bit plugin is prevented from interacting with a 64 bit image today).
There is nothing to prevent this interaction. In order to confirm the obvious, I copied a UnixOSProcessPlugin compiled for a 64-bit image into the directory containing plugins for a 32-bit image. The plugin loads and runs. A primitive that does not involve stack operations (primitiveGetPid) works fine. Other primitives that need to access the stack result in a VM crash as you might expect.
In practice I have never encountered a case in which I accidentally mixed a plugin for 64-bit images with a VM for 32-bit images (or vice versa). I'm sure it's possible, but it has never happened to me, so this may be a problem analogous to 'Smalltalk become: nil' for which the solution is "don't do that".
Dave
Hi Bert,
On Wed, May 6, 2009 at 3:34 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 06.05.2009, at 02:36, David T. Lewis wrote:
On Tue, May 05, 2009 at 11:37:43AM -0700, John M McIntosh wrote:
Folks should try the fix and see
Ihttp://idisk.mac.com/bertfreudenberg-Public/temp
Double-click EtoysToGo.app and drop in wendy.pr for it to explode, well or not after the fix is applied.
I did not try the exploding wendy.pr test,
Well what I uploaded there is just our not-yet-finished "Etoys To Go" project that's supposed to be runnable from an USB flash drive, on any platform. The VMs for Win and Linux there are not yet updated. Etoys actually does not need closure support (yet) but a feature to resolve relative directories (so the user data will also land on the flash drive). John put that feature in his 4.0 series which supports closures, so I needed to use that one. And then I noticed it blows up on entering sandbox mode.
but your ClosureVMPopKiller-M7349 definitely fixes some stack balance bugs, so I loaded it in the VMMaker project on SqS in VMMaker-dtl.122.
The Mantis 7349 issue is marked as status "testing" since I did not actually perform the wendy.pr test. I'll move it to "resolved" next week if no one cites evidence to the contrary.
r.e. your notes in Mantis 7349:
was there not some recovery code for unbalanced stacks somewhere? My concern is there other example of this which we not yet crashed over. So how would we fix the VM to avoid? Or do we need to check all the plugin prim code for coding issues.
Yes there are sure to be more unbalanced stack bugs, and yes somebody should probably check all the plugin code. Most likely nobody will get around to doing that but no worries, the stack VM seems to do an excellent job of finding these bugs :)
How costly would it be to always do this:
interpreterProxy pop: interpreterProxy methodArgumentCount + 1 thenPush: result
or
interpreterProxy pop: interpreterProxy methodArgumentCount
to return self?
Do the right thing:
interpreterProxy pop: interpreterProxy methodArgumentCount + 1 thenPush: result
It is not hugely expensive compared to the other costs associated with the primitive call, and the interpreterProxy indirection.
In Cog & the StackVM we check the argument count of primitive calls on return and fail the primitive if the stack is incorrect. So checking is easier.
But my main reason for suggesting you do the right thing is that soon enough the execution machinery is going to change and get substantially faster. Igor has some good ideas for eliminating interpreterProxy, deferring whether one uses function pointers or direct references to functions until the time a plugin is compiled, and this makes all these calls much faster. You would still write
interpreterProxy pop: interpreterProxy methodArgumentCount + 1 thenPush: result
but the code generator would spit out
    ...
    #if EXTERNAL
    void (*popThenPush)(sqInt, sqInt);
    sqInt (*methodArgumentCount)(void);
    #endif
    ...
    popThenPush(methodArgumentCount() + 1, result);

and the function pointers would be initialized in the plugin initializer.
I want eventually to get away from explicit stack manipulation in plugins and write them all rather like smart plugins and have the code generator create vanilla C functions for them, with varargs for primitives that need them. Primitives would return their result, returning 0 (all zeros) to indicate failure. This would eliminate calls on interpreterProxy for argument access and return, leaving conversion, allocation etc. Combining this with Igor's scheme should be a fair bit faster and cleaner than what we have now.
So at least for the moment go for correctness and use smart syntax plugins as much as possible.
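The stack discipline and the check on return can be sketched with a toy model in C. Everything here (the array-based stack, runPrimitiveChecked, the two example primitives) is a hypothetical simplification written for this illustration; it is not the Cog or StackVM code, only the idea that a primitive called with N arguments must replace receiver plus N arguments with exactly one result.

```c
typedef long sqInt;

static sqInt stack[32];
static int   sp = 0;              /* number of items currently on the stack */
static int   argumentCount = 0;

void push(sqInt v)                { stack[sp++] = v; }
void popThenPush(int n, sqInt v)  { sp -= n; stack[sp++] = v; }
int  methodArgumentCount(void)    { return argumentCount; }

/* Balanced: pops the receiver and the argument, then pushes the result. */
void primAddBalanced(void)
{
    sqInt result = stack[sp - 2] + stack[sp - 1];
    popThenPush(methodArgumentCount() + 1, result);
}

/* Unbalanced: pops only the argument and so never removes the receiver. */
void primAddUnbalanced(void)
{
    sqInt result = stack[sp - 2] + stack[sp - 1];
    popThenPush(methodArgumentCount(), result);
}

/* Run prim with numArgs arguments already pushed above the receiver.
 * Answer 1 if the stack is balanced on return, 0 if not. */
int runPrimitiveChecked(void (*prim)(void), int numArgs)
{
    int depthBefore = sp;
    argumentCount = numArgs;
    prim();
    /* receiver + numArgs items must have been replaced by a single result */
    return sp == depthBefore - numArgs;
}
```

In this model the unbalanced primitive is caught on its first call rather than corrupting the stack silently, which is the behaviour that lets the stack VM find such bugs.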
2009/5/6 Eliot Miranda eliot.miranda@gmail.com:
Hi Bert,
On Wed, May 6, 2009 at 3:34 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 06.05.2009, at 02:36, David T. Lewis wrote:
On Tue, May 05, 2009 at 11:37:43AM -0700, John M McIntosh wrote:
Folks should try the fix and see
Ihttp://idisk.mac.com/bertfreudenberg-Public/temp
Double-click EtoysToGo.app and drop in wendy.pr for it to explode, well or not after the fix is applied.
I did not try the exploding wendy.pr test,
Well what I uploaded there is just our not-yet-finished "Etoys To Go" project that's supposed to be runnable from an USB flash drive, on any platform. The VMs for Win and Linux there are not yet updated. Etoys actually does not need closure support (yet) but a feature to resolve relative directories (so the user data will also land on the flash drive). John put that feature in his 4.0 series which supports closures, so I needed to use that one. And then I noticed it blows up on entering sandbox mode.
but your ClosureVMPopKiller-M7349 definitely fixes some stack balance bugs, so I loaded it in the VMMaker project on SqS in VMMaker-dtl.122.
The Mantis 7349 issue is marked as status "testing" since I did not actually perform the wendy.pr test. I'll move it to "resolved" next week if no one cites evidence to the contrary.
r.e. your notes in Mantis 7349:
was there not some recovery code for unbalanced stacks somewhere? My concern is there other example of this which we not yet crashed over. So how would we fix the VM to avoid? Or do we need to check all the plugin prim code for coding issues.
Yes there are sure to be more unbalanced stack bugs, and yes somebody should probably check all the plugin code. Most likely nobody will get around to doing that but no worries, the stack VM seems to do an excellent job of finding these bugs :)
How costly would it be to always do this:
interpreterProxy pop: interpreterProxy methodArgumentCount + 1 thenPush: result
or
interpreterProxy pop: interpreterProxy methodArgumentCount
to return self?
Do the right thing:
interpreterProxy pop: interpreterProxy methodArgumentCount + 1 thenPush: result
It is not hugely expensive compared to the other costs associated with the primitive call, and the interpreterProxy indirection.
In Cog & the StackVM we check the argument count of primitive calls on return and fail the primitive if the stack is incorrect. So checking is easier.
But my main reason for suggesting you do the right thing is that soon enough the execution machinery is going to change and get substantially faster. Igor has some good ideas for eliminating interpreterProxy, deferring whether one uses function pointers or direct references to functions until the time a plugin is compiled, and this makes all these calls much faster. You would still write
interpreterProxy pop: interpreterProxy methodArgumentCount + 1 thenPush: result
but the code generator would spit out

    ...
    #if EXTERNAL
    void (*popThenPush)(sqInt, sqInt);
    sqInt (*methodArgumentCount)(void);
    #endif
    ...
    popThenPush(methodArgumentCount() + 1, result);

and the function pointers would be initialized in the plugin initializer.
I want eventually to get away from explicit stack manipulation in plugins and write them all rather like smart plugins and have the code generator create vanilla C functions for them, with varargs for primitives that need them. Primitives would return their result, returning 0 (all zeros) to indicate failure. This would eliminate calls on interpreterProxy for argument access and return, leaving conversion, allocation etc. Combining this with Igor's scheme should be a fair bit faster and cleaner than what we have now.
So at least for the moment go for correctness and use smart syntax plugins as much as possible.
It could be tedious, but I think that to make things faster it would be better to use the return value from the primitive function directly and stop using the successFlag. For internal plugins, setting the success flag is as simple as a write to memory, but for external plugins it costs an additional function call.
For old plugins, we could simply change the code generator to declare successFlag locally. But there are some evil cases where a primitive function calls other platform code, and this code sets successFlag directly instead of returning an error result, or something else, to indicate failure. That's why I'm saying it could be a bit tedious.
vm-dev@lists.squeakfoundation.org