On Wed, Jan 12, 2011 at 1:59 AM, Steve Rees <squeak-vm-dev@vimes.worldonline.co.uk> wrote:

Aren't named ivars only accessed by methods of the receiver?

In classical Smalltalk, inst vars are notionally only accessed directly by methods in the receiver's class or superclasses that have an inst size > 0.  However, become: can rebind the receiver so that an illegal direct inst var access can be made.  e.g. create a ByteArray and become it to a Point; sending x to the point accesses a potentially bogus pointer, the raw bits at 0 in the ByteArray.  So if there are extant activations whose receiver is changed through become or changeClass one can observe strange effects and/or crash the system.  The VM may go to some lengths to hide or mitigate these effects.  In my BrouHaHa Smalltalk-80 VM the JIT (bytecode to threaded code) did copy-down so that it could know that the class of self in a threaded code method was constant and so sends to self (~ 40% of all sends) didn't have to be checked.  But a become could change the class of self in an activation and so become also had to scan all activations and rebind the threaded code method in activations on the becommed objects so that self sends remained correct.
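
A toy sketch of the ByteArray/Point failure described above, written in Python rather than Smalltalk. Everything here (Obj, become, point_x) is a hypothetical stand-in and not how any real VM lays out objects; it only illustrates why a direct slot read with no class check goes wrong once the receiver has been rebound.

```python
class Obj:
    """Toy heap object: a class tag plus a flat list of slots (assumed layout)."""
    def __init__(self, cls, is_pointers, slots):
        self.cls = cls                  # stands in for the class pointer
        self.is_pointers = is_pointers  # True if the slots hold object references
        self.slots = list(slots)

def become(a, b):
    """Two-way become: the identities stay put, the contents swap."""
    a.cls, b.cls = b.cls, a.cls
    a.is_pointers, b.is_pointers = b.is_pointers, a.is_pointers
    a.slots, b.slots = b.slots, a.slots

def point_x(receiver):
    """Stands in for Point>>x compiled as a direct read of slot 0, with no class check."""
    return receiver.slots[0]

point = Obj("Point", True, [3, 4])
raw = Obj("ByteArray", False, [0xDE, 0xAD])

before = point_x(point)   # a legitimate object reference
become(point, raw)        # an activation of Point>>x on `point` may still be live
after = point_x(point)    # now the raw byte at index 0, treated as if it were a pointer
```

After the become, point_x still runs without complaint, but the value it hands back comes from a non-pointer object, which is the "potentially bogus pointer" case.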
Or does Squeak use access sends similar to Strongtalk?

One can of course use accessors as a style, and easily modify the compiler to access them this way, but the system (along with most other Smalltalks) does provide direct access and it is used extensively.

In the former case a check to the topmost frame of each Process' stack, and a check for corpsed objects on each method return should be enough, no?

Checking on return is very expensive.  See my paper on context management in VisualWorks 5i.  Instead one could probably scan as part of become/change class, but scan only activations in the stack zone and defer scanning contexts in the heap until they were faulted into the stack zone.  Faulting in is expensive anyway so adding a test for a corpse won't add much overhead.
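
A minimal sketch of that split, in Python, under stated assumptions: corpses are modelled by a forward_to field, and ToyVM, Activation, become_forward and fault_in are hypothetical names, not Cog's. The point is only where the cost lands: the become scans the stack zone eagerly, and heap contexts pay for the corpse test when they are faulted in.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.forward_to = None      # set when this object becomes a corpse

def follow(obj):
    """Chase forwarding pointers until a live object is reached."""
    while obj.forward_to is not None:
        obj = obj.forward_to
    return obj

class Activation:
    def __init__(self, receiver):
        self.receiver = receiver

class ToyVM:
    def __init__(self):
        self.stack_zone = []        # activations currently held as stack frames
        self.heap_contexts = []     # contexts that live in the heap

    def become_forward(self, old, new):
        old.forward_to = new        # old is now a corpse
        for act in self.stack_zone: # eager scan, stack zone only
            act.receiver = follow(act.receiver)

    def fault_in(self, ctx):
        """Move a heap context into the stack zone, testing for a corpse on the way."""
        self.heap_contexts.remove(ctx)
        ctx.receiver = follow(ctx.receiver)
        self.stack_zone.append(ctx)
        return ctx

vm = ToyVM()
old, new = Obj("old"), Obj("new")
on_stack, in_heap = Activation(old), Activation(old)
vm.stack_zone.append(on_stack)
vm.heap_contexts.append(in_heap)

vm.become_forward(old, new)
stale_before_fault = in_heap.receiver is old   # heap context not yet fixed up
vm.fault_in(in_heap)
```

Nothing is checked on method return or on inst var access in this sketch; the only extra work is in become_forward and fault_in.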

In the latter case the class in the inline cache will differ, giving an opportunity in the lookup to fixup the original. One might also check the corpse bit in ivar accesses to allow an opportunity to replace corpsed references ahead of a full GC.

Again that kind of check isn't cheap.  Remember that the check must be made on inst var reads as well as writes.  When I added immutability to VisualWorks, which tests only writes, the total cost was about 3% to 5%, which was more than acceptable for the benefit, and it was so low precisely because an inst var write required a store check and so part of the immutability test could be folded into the store check, bringing down its overall cost.  But adding a similar check for corpses to inst var reads would probably add costs above 10% and that's getting expensive.  Since inst var access is common and become is relatively rare (we've got to be talking millions to one in normal code, right?) it makes sense to me to put the cost in become and rare operations such as faulting contexts into the stack zone.

The only case where I think this might still be a problem is when Cog does method inlining and has inlined access sends, though even here it would presumably have a type test ahead of the inlined code which the corpse would fail because of the changed class, triggering an uncommon branch or falling back to a traditional non-inlined send (depending on the approach Cog uses - I haven't looked at the code). Both of which give an opportunity to handle the corpsed reference, so maybe there won't be a problem here either.

Right.  If one is doing adaptive optimization then become/change class has to take care to preserve optimized code invariants and/or dynamically deoptimize when it violates them.  But Cog doesn't do this /yet/ :)


I think you can also use this for two-way become by cloning both objects, marking each as a corpse and having each corpse refer to the clone of the other.

Right, noting the caveats we've discussed here.  It's all just a small matter of programming :)
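
The two-way scheme suggested above (clone both objects, mark each original as a corpse, point each corpse at the clone of the other) can be sketched in Python as below. The Obj layout, forward_to field and become_two_way name are assumptions made for illustration; a real implementation would also need the activation scanning discussed earlier in this thread.

```python
class Obj:
    def __init__(self, cls, slots):
        self.cls = cls
        self.slots = list(slots)
        self.forward_to = None      # non-None marks this object as a corpse

def follow(ref):
    """Resolve a reference through any corpses to the live object."""
    while ref.forward_to is not None:
        ref = ref.forward_to
    return ref

def become_two_way(a, b):
    """Clone both, then make each original a corpse referring to the other's clone."""
    clone_a = Obj(a.cls, a.slots)
    clone_b = Obj(b.cls, b.slots)
    a.forward_to = clone_b
    b.forward_to = clone_a
    return clone_a, clone_b

a = Obj("Point", [3, 4])
b = Obj("ByteArray", [0xDE, 0xAD])
clone_a, clone_b = become_two_way(a, b)
```

Any reference that still points at a or b reaches the swapped contents once it is followed, so the references themselves can be rewritten lazily, for example during GC.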


Regards, Steve

On 12/01/2011 09:37, Josh Gargus wrote:

On Jan 11, 2011, at 11:48 PM, Igor Stasenko wrote:

Eliot, one thing about 'forwarded' objects, which you are calling
forwarding corpses, is that they can be used not only for pinning
but also with the #becomeForward primitive, making it work a lot faster,
since it does not require scanning the whole heap to update references,
and the update can be done during GC.
The only problem, as you pointed out, is objects which don't have
enough space for a forwarding pointer. But for this case, I think the
primitive can fall back and use the old slow scheme.

I was thinking the same thing.  But wouldn't there remain the problem that Eliot mentioned about objects with named variables?

