Hi--
Igor writes:
...any object/class which interfaces with core VM functionality should be in that hierarchy.
Heh, but that just shifts the question over to what exactly "core VM functionality" is. There's a lot of cruft in the VM, too. And I happen to agree about starting from a desired high-level task (that's exactly what I did for Spoon). I'm suspicious of any process which doesn't start from there.
The rest is optional, since by having a compiler we could file in any code we want.
Well, I prefer to install compiled methods directly without recompiling anything, but sure.
Stephen Pair writes:
Hmm, I think the key question here is: what do you want to be able to do with the image you create?
Sure, I personally think that should be where the process starts (otherwise I suspect unnecessary things get included), but I'm interested in approaches from that point that differ from mine.
I've no idea how John did this, but I think I would start by writing a method that did all the things I'd want to be able to do in the image I ultimately create. This might be things like compiling a string of code, loading something from a file (or socket), etc...the bare minimum of things you'd absolutely need to do in a minimal image. Then I'd guess you could create a shadow M* class with this method in it and some sort of proxies for literals and global references. With a little doesNotUnderstand: magic I'd think you could run that method, hit those proxies and copy over other classes and methods as you hit them. I'm sure the devil is in the details, but this top-level method that expresses exactly the things that you'd want to be able to do in this minimal image seems like it would be an interesting artifact in and of itself. It would be in a sense a specification of this image.
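(For the archives, here's a minimal sketch of the proxy trick Stephen describes. The class name is hypothetical and this isn't anyone's actual mechanism; it just shows how doesNotUnderstand: can record which globals the bootstrap method touches before forwarding the message along.)

```smalltalk
"Hypothetical sketch: a proxy standing in for a global reference.
When the bootstrap method sends it any message, doesNotUnderstand:
fires, records the dependency, and forwards to the real object."
Object subclass: #TracingProxy
	instanceVariableNames: 'subject touched'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Shrink-Sketch'

TracingProxy>>subject: anObject touchedSet: aSet
	subject := anObject.
	touched := aSet

TracingProxy>>doesNotUnderstand: aMessage
	"Note that the minimal image needs the subject's class,
	 then let the real object handle the message."
	touched add: subject class.
	^ aMessage sendTo: subject
```

Running the top-level method against proxies like this would populate the touched set with exactly the classes the "specification" method reaches.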
This is roughly what I did with Spoon, although my tactic was to mark everything in a normal object memory involved in a particular task, then use the garbage collector to throw away everything else atomically[1]. I like to have a known-working object memory at every point in the process, by dealing with a running memory as much as possible (rather than creating one in situ and hoping that it works when resumed).
Igor responds:
I have a similar idea: capture methods/classes while running code, to discover which objects I need to clone into a separate heap so it can run under another interpreter instance in Hydra.
Squeak already has facilities which can be used for this (MessageTally>>tallySends:). We just need to make some modifications to it and, as you pointed out, since we need a bare minimum, capture things while running:
FileStream fileIn: 'somecode.st'
is a good starting point.
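(A sketch of what Igor is suggesting. Stock MessageTally tallySends: only produces a report, so #touchedClasses below is a hypothetical accessor, one of the "modifications" he mentions.)

```smalltalk
"Sketch: tally every send made during a file-in, then harvest
the classes whose methods actually ran. #touchedClasses is a
hypothetical accessor on the tally tree, not stock Squeak."
| needed |
needed := (MessageTally tallySends:
	[FileStream fileIn: 'somecode.st']) touchedClasses.
"needed now approximates the contents of a minimal image that
 can perform this particular file-in."
```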
Right, although I think using the VM to do the marking is more convenient, and faster.
Andreas writes:
The alternative (which I used a couple of years back) is to say: Everything in Kernel-* should make for a self-contained kernel image.
Aha, yeah; I'm not that trusting. :)
So I started by writing a script which would copy all the classes and, while doing so, rename all references to classes (regardless of whether they were defined in the kernel or not).
At the end of the copying process you end up with a huge number of Undeclared variables. This is your starting point. Go in and add, remove or rewrite classes and methods so that they do not refer to entities outside of your environment. This requires some judgment calls; for example, I had a category Kernel-Graphics which included Color, Point, and Rectangle. Then I did another pass removing lots of methods which I had determined to be unused.
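(A rough sketch of the enumeration step of such a script. The actual copy, duplicating each class under a new name and remapping the class references inside its compiled methods, is the hard part and is elided; #copyWithPrefix: is hypothetical.)

```smalltalk
"Sketch: gather every class in a Kernel-* category, then make
renamed copies. copyWithPrefix: is a hypothetical helper; the
reference-remapping pass Andreas describes is not shown."
| kernelClassNames |
kernelClassNames := OrderedCollection new.
(SystemOrganization categories
	select: [:cat | cat beginsWith: 'Kernel-'])
		do: [:cat | kernelClassNames addAll:
			(SystemOrganization listAtCategoryNamed: cat)].
kernelClassNames do: [:name |
	(Smalltalk at: name) copyWithPrefix: 'K'].  "hypothetical"
"Undeclared then accumulates every reference escaping the copied
 environment -- the starting point for the cleanup pass."
```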
Yeah, that's a lot of work; perhaps on the order of work I was doing earlier in the project, when I was removing things manually with remote tools[2].
At the end of the process I wrote a script that (via some arcane means) did a self-transformation of the image I was running, which magically dropped from 30MB to 400k in size. Then I had a hard disk crash and most of the means I'd been using in this work were lost :-(((
Ouch! I'm sorry to hear that. That actually happened to me too (in 2005), but through a total coincidence I had a sufficiently-recent backup to keep going. Several nice minutes of panic...
I still have the resulting image but there is really no realistic way of recovering the process. Which is why I would argue that the better way to go is to write an image compiler that takes packages and compiles them into a new object memory. That way you are concentrating on the process rather than on the artifact (in my experience all the shrinking processes end up with nonrepeatable one-offs).
Oh, I agree that shrinking is not something one should do to produce deployment artifacts. I think it should be done to get a truly minimal memory that can load modules, and then never done again (although the way I do it is repeatable, for the sake of review).
As for whether to produce an object memory statically and then set it running, or transform an object memory which is always running... I think the resulting memory will need to load modules live anyway, so one might as well do all the transformations that way. Perhaps this is simply an aesthetic choice.
thanks,
-C
[1] http://tinyurl.com/2gbext (lists.squeakfoundation.org) [2] http://tinyurl.com/bdtdlb (lists.squeakfoundation.org)
-- Craig Latta www.netjam.org next show: 2009-03-13 (www.thishere.org)