On Thu, Jan 28, 2010 at 10:42 PM, Colin Putney <cputney@wiresong.ca> wrote:


On 2010-01-27, at 2:04 PM, Levente Uzonyi wrote:

> Hi,
>
> I had an idea a few days ago and even though I don't have the time or knowledge to try it myself, I just can't get it out of my head. The idea is to let an interpreter use two images at once. One of them is a read-only, "fully working" image, let's call it S (source); the other is empty (contains no objects), writable, and possibly generated on the fly, let's call it W (working). The VM knows whether an object is in S or W by checking the object pointer. Whenever an object in S is about to be modified, a copy is created in W and all references to it are changed to point to the copy (which means that more than one object might have to be copied). This means a slower startup, but once all necessary objects are copied, performance would be normal.
> (This approach is similar to the way sources are handled today: the sources file is read only, new source code goes to the changes file.)

I believe VW can do something like this - they call it "Shared Perm Space." There's a special section of memory that's immutable, not subject to garbage collection, and shared between several VM processes.

It used to exist and then was broken when Barry Hayes and I added memory mapping of new heap segments back in the late 90's.  I was working on bringing it back when I left.

You're almost right (and I'm probably being pedantic; forgive me).  PermSpace (not shared) is a third generation that is not collected unless one does a global GC.  VW has a scavenger, a stop-the-world mark-sweep collector and an incremental mark-sweep collector.  The scavenger collects only new space.  The incremental collector, run in short bursts for a few milliseconds under image-level control, collects oldSpace.  The stop-the-world collector will collect oldSpace or oldSpace + permSpace.  So permSpace is only collected when one does a global stop-the-world collection (globalGarbageCollect) not an oldSpace collection (garbageCollect).  To populate permSpace one does a "perm save" which does an otherwise normal image save that sets a bit in the image header that causes the VM to load the entire image into permSpace.  One then does a globalGarbageCollect and saves, resulting in an image in which most objects are in permSpace (particularly all classes and methods) but where transient objects (font descriptions loaded at startup etc) are in oldSpace.  So the incremental collector, collecting oldSpace, doesn't waste time scan-marking classes and methods, and hence is much more effective.
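As a rough summary of the paragraph above, here is a small Python table of which collector touches which space. It is only an illustrative model: garbageCollect and globalGarbageCollect are the names used above, while "scavenge" and "incrementalGC" are labels invented for this sketch and are not meant to be real VW selectors.

```python
# Illustrative model only; not a real VM API.
# "scavenge" and "incrementalGC" are hypothetical labels for this sketch.
COLLECTS = {
    "scavenge": {"newSpace"},                         # scavenger: new space only
    "incrementalGC": {"oldSpace"},                    # incremental mark-sweep
    "garbageCollect": {"oldSpace"},                   # stop-the-world, oldSpace
    "globalGarbageCollect": {"oldSpace", "permSpace"},  # stop-the-world, global
}

def spaces_collected(collector):
    """Return the set of spaces a given collector reclaims."""
    return COLLECTS[collector]

def collectors_for(space):
    """Return the sorted names of the collectors that reclaim a given space."""
    return sorted(name for name, spaces in COLLECTS.items() if space in spaces)
```

Asking collectors_for("permSpace") yields only globalGarbageCollect, which is the point of the scheme: everything parked in permSpace is skipped by the collectors that run routinely.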

Shared permSpace extends the scheme by memory mapping an image file's permSpace segment using copy-on-write.  So as objects in permSpace are written to, the affected pages of the permSpace part of the image file are copied into private memory.  No effort is made to do things like cluster class variables (which are the most likely targets of writes into permSpace) together on pages to reduce the amount of copying when writes do occur.  A tracer approach would do much better here.
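The copy-on-write behaviour can be seen in miniature with Python's mmap module, whose ACCESS_COPY mode gives a private mapping. This is only a sketch of the operating-system mechanism, not of the VM code: the temporary file stands in for the permSpace segment of an image file, and its contents and the function name are made up for the example.

```python
import mmap
import os
import tempfile

def demo_copy_on_write():
    """Write through a private mapping and show the file is unchanged."""
    # Hypothetical stand-in for the permSpace segment of an image file.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"PERM" * 1024)
        with open(path, "r+b") as f:
            # ACCESS_COPY: writes go to private pages, never to the file.
            m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
            try:
                m[0:4] = b"XXXX"          # triggers a private copy of the page
                in_memory = bytes(m[0:4])
            finally:
                m.close()
        with open(path, "rb") as f:
            on_disk = f.read(4)
        return in_memory, on_disk
    finally:
        os.remove(path)
```

The mapping sees the new bytes while the file keeps the old ones. Copying happens per page, which is why clustering the frequently written objects onto few pages would reduce the amount of private memory used.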

You can infer that memory mapping new oldSpace segments broke shared permSpace because shared permSpace was hacked to map the file at a hard-coded address.  I was trying to bring back shared permSpace for 64-bit images (where it would have more impact because 64-bit objects are bigger) by doing things like aligning the object headers of oldSpace objects on a 16-byte boundary and permSpace objects 8 bytes from a 16-byte boundary so that the permSpace test was a tag test (there being 3 bits of immediate tags).
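A sketch of what such a tag test might look like, in Python for brevity. The assumptions are mine and not taken from the VM source: non-immediate object pointers have all three tag bits clear, and the next bit up (the 8-byte offset) is what separates permSpace from oldSpace. The function names are hypothetical.

```python
TAG_BITS = 3                      # three bits of immediate tags, per the text
TAG_MASK = (1 << TAG_BITS) - 1    # 0b0111
PERM_BIT = 1 << TAG_BITS          # 0b1000: header sits 8 bytes off a 16-byte boundary

def is_immediate(oop):
    """Assumed: any non-zero tag marks an immediate, not a heap pointer."""
    return (oop & TAG_MASK) != 0

def is_perm(oop):
    """Heap pointer whose address is 8 mod 16, i.e. a permSpace object."""
    return (oop & (TAG_MASK | PERM_BIT)) == PERM_BIT

def is_old(oop):
    """Heap pointer whose address is 0 mod 16, i.e. an oldSpace object."""
    return (oop & (TAG_MASK | PERM_BIT)) == 0
```

For example, under these assumptions 0x1000 tests as oldSpace and 0x1008 as permSpace, with no need to compare the pointer against segment bounds.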

HTH
Eliot
 

> - combine it with HydraVM, it might give Erlang-like capabilities (cheap
>  and fast processes)

Well, we already have cheap and fast processes. The overhead for creating a new instance of Process and scheduling it is very low. What we lack is isolation between them. Squeak seems to be drifting in that direction, though. Islands are a good start. Josh's recent contribution of futures to the trunk is another step away from shared-state concurrency.

My sense of it is that efficient use of memory isn't the most important problem to solve at the moment. Further steps toward event-loop concurrency would be more fruitful.

Colin