Hi All,

    as I've mentioned I've been thinking of implementing a pinning GC for the threaded Cog VM.  Pinning is really important for a threaded FFI since the GC can run in parallel with FFI calls and so move objects while FFI calls are in progress.  The idea then is to arrange that every object passed out through the FFI is pinned.  This is easy to do in a Smalltalk VM that has an object table; an object header bit is used as the "is pinned" flag.  The marshalling code in FFI calls checks the "is pinned" bit and if not set, allocates a clone of the object in a region of the heap that the GC does not compact, with the "is pinned" bit set, and does a become.
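A toy model of the object-table case, in Python for brevity.  This is only a sketch: Body, object_table and pin_via_become are made-up names, and a real VM would do this in the FFI marshalling primitive.  The point is that with an object table the become is just a matter of repointing one table entry.

```python
# Toy model only; all names are hypothetical, not part of any real VM.
class Body:
    def __init__(self, data, pinned=False):
        self.data = list(data)
        self.pinned = pinned      # stands in for the "is pinned" header bit

object_table = {}                 # oop -> Body, i.e. the object table

def pin_via_become(oop):
    """Ensure the body behind oop is pinned, and return that body."""
    body = object_table[oop]
    if not body.pinned:
        # Clone into the (imagined) non-compacted region with the bit set...
        clone = Body(body.data, pinned=True)
        # ...and "become": every reference goes through the table, so
        # repointing this one entry redirects all of them at once.
        object_table[oop] = clone
    return object_table[oop]
```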

In a Smalltalk VM without an object table (Squeak, VisualAge etc) this is more challenging.  But a recent comment on Gilad Bracha's Newspeak blog by "TruePath" (who's he? ed.) points out a technique that applies:
    "Remember smalltalk is a dynamic language so every method call (and that is all there is) requires we check the type pointer and (if the PIC or other caches don't include an appropriate entry) use the type to resolve the method call. One could easily have a special type that escapes into the runtime on any message at which point the runtime replaces the reference the message call was dispatch on with the new location and retries the call. Indeed, any modern GC has to have some way of doing indirection like this so the heap can be compacted."

So the FFI marshalling code checks the "is pinned" bit, and if unset allocates a clone of the object in a region of the heap that the GC does not compact, changes the class/type field of the object to the special "i'm a forwarding corpse" value, and sets a forwarding pointer to the pinned copy.
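Roughly, in toy Python (a sketch under assumptions, not VM code: Obj, CORPSE, pinned_region and pin_for_ffi are all invented names, and the corpse's first slot is assumed to hold the forwarding pointer):

```python
# Toy model only; every name here is hypothetical.
CORPSE = 0   # assumed class marker meaning "I'm a forwarding corpse"

class Obj:
    def __init__(self, class_index, slots, pinned=False):
        self.class_index = class_index
        self.slots = list(slots)
        self.pinned = pinned

pinned_region = []   # stands in for the part of the heap the GC never compacts

def pin_for_ffi(obj):
    """Return the pinned object to pass out through the FFI."""
    # If obj was corpsed by an earlier call, chase the forwarding pointer.
    while obj.class_index == CORPSE:
        obj = obj.slots[0]
    if obj.pinned:
        return obj
    # Clone into the non-compacted region with the "is pinned" bit set.
    clone = Obj(obj.class_index, obj.slots, pinned=True)
    pinned_region.append(clone)
    # Turn the original into a corpse that forwards to the pinned copy.
    obj.class_index = CORPSE
    obj.slots = [clone]
    return clone
```

Passing the same object out twice yields the same pinned copy, because the second call follows the corpse rather than cloning again.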

There are two problems with this.

One, it doesn't work for objects with named inst vars; in Smalltalk an object's named inst vars are accessed directly, without a message send, so there is no class check at which a corpse could be caught.  But that's easy; one likely doesn't have any business handing out such objects through the FFI, so we can fail calls that do this, and provide special wrappers to hand out indirect references to objects with named inst vars.  (There was a discussion on this approach last year; Andreas proposed a generic object handle scheme.)

Two, there needs to be room in the corpse for a forwarding pointer to the pinned copy.  There isn't enough space in zero-sized byte data.  But one has no business handing out pointers to empty byte data anyway; attempts to write data into them must overwrite the heap, potentially disastrously.  So the FFI can either pass zero-sized objects as null pointers or fail the call.  So there will always be enough room in byte data to corpse it to a pinned copy because we'll only attempt the operation on non-empty objects (and the underlying heap representation will round up the size of any object to at least a pointer's width).
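The size argument can be checked with a little arithmetic.  A sketch in Python, assuming a 32-bit VM (WORD_SIZE = 4) and made-up names throughout; None stands in for a null pointer:

```python
# Toy model only; WORD_SIZE and all function names are assumptions.
WORD_SIZE = 4   # assumed pointer width in bytes

def allocated_size(byte_length):
    """Body size after the heap rounds up to a whole number of words."""
    return -(-byte_length // WORD_SIZE) * WORD_SIZE

def can_be_corpsed(byte_length):
    """A corpse needs room for one forwarding pointer."""
    return allocated_size(byte_length) >= WORD_SIZE

def marshal_byte_arg(data, fail_on_empty=False):
    """Decide what to do with a byte object before pinning it."""
    if len(data) == 0:
        if fail_on_empty:
            raise ValueError("cannot pass empty byte data through the FFI")
        return None            # pass a null pointer instead
    # Any non-empty byte object occupies at least one word, so it can
    # always hold the forwarding pointer once it is corpsed.
    assert can_be_corpsed(len(data))
    return data
```

Which of the two policies for empty objects (null pointer or failed call) is the better one is a design choice this sketch doesn't settle.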

So all one needs is a special class marker, say 0, for corpses.  The GC must be modified to follow the forwarding pointer through corpses, as must the message lookup machinery.  The heap must have a region that is not compacted, such as old space.  Old space could have a free list and be compacted only on snapshot, i.e. by writing a file that has the free space squeezed out but that /does not/ move objects in the heap.  VW's image file format is an example; there the compaction actually happens on image load, when the objects in the snapshot file are relocated and pointers swizzled.
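The two modifications might look something like this toy Python.  It is only a sketch: the names, the METHODS table and the class indices 7 and 9 are invented, and a real VM would take the corpse path from its lookup-failure or PIC-miss code.

```python
# Toy model only; every name and class index here is hypothetical.
CORPSE = 0   # the special class marker for corpses

class Obj:
    def __init__(self, class_index, slots):
        self.class_index = class_index
        self.slots = list(slots)

def follow(ref):
    """Chase forwarding pointers until we reach a live object or a non-object."""
    while isinstance(ref, Obj) and ref.class_index == CORPSE:
        ref = ref.slots[0]
    return ref

def scan_fields(obj):
    """GC side: replace each reference to a corpse with its pinned copy."""
    obj.slots = [follow(ref) for ref in obj.slots]

# (class index, selector) -> method; stands in for real method lookup.
METHODS = {(7, "size"): lambda rcvr: len(rcvr.slots)}

def send(holder, index, selector):
    """Lookup side: send selector to the object in holder.slots[index]."""
    rcvr = holder.slots[index]
    if rcvr.class_index == CORPSE:
        # Escape into the "runtime": fix the reference, then retry the send.
        rcvr = follow(rcvr)
        holder.slots[index] = rcvr
    return METHODS[(rcvr.class_index, selector)](rcvr)
```

Once the GC has scanned every object that refers to a corpse, nothing points at the corpse any more and it can be reclaimed like any other garbage.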

This should have been obvious to me, but became so only on reading TruePath's comment (thanks again).