Squeak is a growing community with diverse needs. We have long outgrown the monolithic image left to us by our founders, Dan Ingalls and company. The community has long been building its own images for many purposes, targeting many audiences: end-users, application developers, core developers, and researchers. Some examples:
2001: Squeakland Etoys
2002: Croquet
2003: Spoon, by Craig Latta: an experiment in very high modularity
2004: DPON, by Michael van der Gulik: a project to revive Henrik Gedenryd's modularity framework, abandoned in Squeak 3.3
2005: Morphic removal, by Edgar J. de Cleene
2006: Scratch, by the MIT Media Lab: an end-user application for easily building and sharing animations, and for teaching basic programming
2007: squeak-dev, by Damien Cassou: a distribution targeted at application developers
2008: Pharo, by Stephane Ducasse and others: a team focused on dramatically raising the code quality, usability, and standards conformance of Squeak overall
We need to burn from our minds the idea that there is one "official" Squeak image that serves the needs of the whole community. It is a lie.
Given that no "official image" can meet the needs of everybody, who should we target when building something official?
We need to build things for those who would build better images themselves. Having many good images to choose from makes everybody happier. The only issue with the situation is that they are not always compatible. I believe this is the core issue that the board and the Squeak release team need to address.
A standard "kernel image" that everyone builds off of has long been a pipe dream of nearly everyone in the community. I believe that such an image is not achievable in the short term; convincing all of the Squeak distributions to adopt it incrementally would be nearly impossible.
A more practical approach to that end is Standard Packages. A major issue we have currently is that bugs get fixed in one distribution but never applied to another. However, if all distributions of Squeak used the same version of a package, it would be very easy: fix the bug in the common package, and each distribution will eventually upgrade to it.
I would like the board to do the following project, and I can manage it:
By this time next year, every Squeak distribution (squeak.org, Pharo, Etoys, Cobalt) will be running a standard version of the following three packages:
- Collections
- Streams
- Compiler
We will also fix and close all of the issues on Mantis relating to these packages.
If we start this process and continue it, we will eventually have enough standard packages to build a kernel image, and everybody will already be running on it.
I do not believe the board should do this to the exclusion of all else; it would be just one of several projects. I fully support Andreas's election platform: we should indeed start having both a stable and an unstable release, proceeding in parallel.
A standard "kernel image" that everyone builds off of has long been a pipe dream of nearly everyone in the community. I believe that such an image is not achievable in the short term; convincing all of the Squeak distributions to adopt it incrementally would be nearly impossible.
Such an image exists: it is the MorphicCore of Pavel Krivanek. We should go towards this, removing packages from the top and reshaping packages where packages as we know them today can't be unloaded/loaded cleanly.
The first step was 3.10, which Ralph and I designed and built, and which Damien used for the dev images. The second step is SqueakLightII, which moves Etoys and Nebraska (and others) out and lets them be reloaded. It also brings the idea of the class repository and the "intelligent load". This is in beta now and can load old and new code, and in some cases code from foreign forks. It only needs help to polish these ideas and to reach common ground across all the Squeak forks.
And we need "official images", like Linux has a common kernel.
A clarification: I was not the Morphic wizard; that was the amazing Morphic Teams I and II, with Dan, Ned, Juan, and Jerome. I only learned a little from them, and wish to learn a lot from you, Andreas, and the others, as I am learning from all the wonderful people on the Board today.
Edgar
On Sat, Feb 28, 2009 at 1:17 AM, Edgar De Cleene edgardec2001@yahoo.com.ar wrote:
A standard "kernel image" that everyone builds off of has long been a pipe dream of nearly everyone in the community. I believe that such an image is not achievable in the short term; convincing all of the Squeak distributions to adopt it incrementally would be nearly impossible.
Such an image exists: it is the MorphicCore of Pavel Krivanek. We should go towards this, removing packages from the top and reshaping packages where packages as we know them today can't be unloaded/loaded cleanly.
Any image containing a GUI is a non-starter IMO. People may not want a GUI (e.g. the embedded and scripting folks). People may want a particular GUI (MVC, Morphic, Tweak, Newspeak, Croquet, one of the native GUIs) with no vestiges of the old one. So the common image needs to be a small headless core that can bootstrap any image. This image needs minimal scripting support to respond to command-line bootstrap commands (including cross-platform stdin & stdout and a file interface), a compiler with which to compile code, collections, magnitudes, exceptions (as necessary), a default error handler that dumps the stack to stdout and then aborts, and that's about it.
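The bootstrap behaviour described here can be sketched in a few lines. This is an illustrative Python model rather than Squeak code; the function name and the use of Python's `eval` as a stand-in for the Smalltalk compiler are my own hypothetical choices.

```python
import sys
import traceback

def bootstrap_loop(stdin=sys.stdin, stdout=sys.stdout):
    """Minimal headless driver: read one bootstrap command per line,
    evaluate it, and print the result; on any error dump the stack
    to stdout and then abort -- no GUI anywhere."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            # eval() stands in for the image's compiler here
            print(eval(line), file=stdout)
        except Exception:
            traceback.print_exc(file=stdout)  # default error handler: stack to stdout...
            sys.exit(1)                       # ...then abort
```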
All images derived from it should be derived by running scripts (repeatable process). These scripts should be versioned.
Further, this initial image should be built from scratch, e.g. using John Maloney's MicroSqueak as a starting point. In MicroSqueak a sub-hierarchy rooted at MObject (but it could be MProtoObject) defines the classes that will end-up in the generated micro-kernel image. So this set of classes can be defined as a package and loaded into any Squeak. An image builder analyses the MObject hierarchy and from it generates a new image containing only the classes in that category with all globals renamed from MFoo back to Foo. There are other approaches but John's is a good one. One can test the result within Smalltalk using the IDE. (There are limitations; nil, true, false, Symbol, SmallInteger et al are not rooted in MObject but in Object). One can browse the package using the IDE.
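The renaming step can be modelled very simply. The following Python toy (hypothetical names; the real builder operates on live Smalltalk classes and writes out an image file, not a dictionary) shows the MFoo-to-Foo mapping over a class dictionary:

```python
def generate_kernel(m_classes):
    """Map a MicroSqueak-style hierarchy ({'MObject': ..., 'MString': ...})
    to the names the generated image will use ({'Object': ..., 'String': ...})."""
    kernel = {}
    for name, cls in m_classes.items():
        if not name.startswith('M'):
            raise ValueError(f'{name} is not part of the M* hierarchy')
        kernel[name[1:]] = cls  # MFoo -> Foo
    return kernel
```

For example, generate_kernel({'MObject': ..., 'MArray': ...}) yields a dictionary keyed by 'Object' and 'Array'.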
The results of building from this can be recorded, e.g. if one bootstraps a minimal Morphic image from this "micro kernel Squeak image" the minimal Morphic image can itself be a starting point for other images because it is also a known repeatably generatable object. So it too can reliably serve as the seed for other images.
Of course, any image can serve as the seed for any other but if it was built by hand and is ever lost it can never be recreated; at least one can never be sure one has recreated it exactly.
Craig, do you agree?
If so, how much of this do you have already?
If not, what have I got wrong?
BTW, I intend to build something like this when and if I do a new object representation for Cog later this year.
(also see BTW2 below)
BTW2, IMO this (headless generation) also applies to the VM. VMMaker is fun but difficult to audit, error-prone, and source-code-control/repeatability unfriendly. VMMaker needs to be scriptable so that it can generate VM sources headlessly (easily done; the Newspeak team have already done it). Further, producing different versions of the source for different platforms is questionable. I would arrange that metadata on methods identified platform-specific code, e.g.

    myWindowsOnlyPrimitive
        <platforms: #(Win32)>
        self blah

generates

    #if WIN32
    sqInt myWindowsOnlyPrimitive() {
        blah();
    }
    #endif /* WIN32 */

at least for the core VM, so that people can build a core VM for their platform from a single check-out containing one copy of the sources, not three.
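The pragma-to-#if translation above could look something like the following. This is a Python sketch with hypothetical names (the real VMMaker emits functions through its C code generator, and the platform-to-macro mapping here is an assumption):

```python
def emit_guarded(name, body_c, platforms=None):
    """Render a primitive as C, wrapped in preprocessor guards when a
    <platforms: #(...)> pragma restricts it to specific platforms."""
    macros = {'Win32': 'WIN32', 'MacOS': 'MACOS', 'Unix': 'UNIX'}  # assumed mapping
    fn = f'sqInt {name}(void) {{\n\t{body_c}\n}}\n'
    if not platforms:
        return fn  # platform-neutral: no guard needed
    guard = ' || '.join(macros[p] for p in platforms)
    return f'#if {guard}\n{fn}#endif /* {guard} */\n'
```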
I've recently made one good step in this direction in changing the header VMMaker generates. The existing one includes the date on which one pushed the button (what use is that? It tells one *nothing* about what one has produced or where it came from; if one pushes the button starting from exactly the same starting point as yesterday, one generates different sources?!). The change is to state the packages from which it was built, e.g. here revealing that one can't trust this build because the package isn't checked in (as the * indicates).
    /* Automatically generated by
            CCodeGenerator * VMMaker-eem.293 uuid: dff7906f-2c49-4278-9401-8bccc2e6ef13
       from
            SimpleStackBasedCogit * VMMaker-eem.293 uuid: dff7906f-2c49-4278-9401-8bccc2e6ef13
     */
    static char __buildInfo[] = "SimpleStackBasedCogit * VMMaker-eem.293 uuid: dff7906f-2c49-4278-9401-8bccc2e6ef13 " __DATE__ ;
Best Eliot
On Sat, 28 Feb 2009 15:35:49 +0100, Eliot Miranda wrote:
On Sat, Feb 28, 2009 at 1:17 AM, edgar De Cleene wrote:
A standard "kernel image" that everyone builds off of has long been a pipe dream of nearly everyone in the community. I believe that such an image is not achievable in the short term; convincing all of the Squeak distributions to adopt it incrementally would be nearly impossible.
Such an image exists: it is the MorphicCore of Pavel Krivanek. We should go towards this, removing packages from the top and reshaping packages where packages as we know them today can't be unloaded/loaded cleanly.
Any image containing a GUI is a non-starter IMO. People may not want a GUI (e.g. the embedded and scripting folks). People may want a particular GUI (MVC, Morphic, Tweak, Newspeak, Croquet, one of the native GUIs) with no vestiges of the old one. So the common image needs to be a small headless core that can bootstrap any image. This image needs minimal scripting support to respond to command-line bootstrap commands (including cross-platform stdin & stdout and a file interface), a compiler with which to compile code, collections, magnitudes, exceptions (as necessary), a default error handler that dumps the stack to stdout and then aborts, and that's about it.
All images derived from it should be derived by running scripts (repeatable process).
Sure, and Pavel's has this all, and it's working, no wonder that Edgar often mentions it:
- http://www.cincomsmalltalk.com/userblogs/ralph/blogView?entry=3342635112
These scripts should be versioned.
Further, this initial image should be built from scratch, e.g. using John Maloney's MicroSqueak as a starting point.
Interesting. Where is that one, search didn't show it:
- http://www.google.com/search?q=John+Maloney+MicroSqueak
[... much more good stuff cut away ...]
On Sat, Feb 28, 2009 at 7:44 AM, Klaus D. Witzel klaus.witzel@cobss.com wrote:
On Sat, 28 Feb 2009 15:35:49 +0100, Eliot Miranda wrote:
Sure, and Pavel's has this all, and it's working, no wonder that Edgar often mentions it:
If it doesn't have a GUI then why is it called MorphicCore?! Reading the blog entry, it looks like it has Etoys removed but not much more. Pavel, is it a headless image? Klaus, if the image is not headless then it doesn't meet my specification.
http://www.cincomsmalltalk.com/userblogs/ralph/blogView?entry=3342635112
-- "If at first, the idea is not absurd, then there is no hope for it". Albert Einstein
KernalImage doesn't have a GUI. It can load Monticello packages. He has a script that will load in MorphicCore.
MorphicCore does have a GUI (but it has many bugs).
MorphicExt is more stable, but larger.
KernalImage is a Squeak image without any GUI. It has collections, numbers, classes, the compiler, but little else. It has a text-based UI that can evaluate Smalltalk expressions, and it can load files from disk. Pavel has a fileIn for Monticello, so then you can load any other package, as long as it doesn't have a GUI.
His other image is called MinimalMorphic. He has a Monticello package for it, so you can load it from KernalImage. MinimalMorphic isn't particularly minimal, but it has had some stuff (like eToys) removed and I think he wants to remove more. It is a basic Morphic environment, showing that you can separate MVC from Morphic.
On Sat, 28 Feb 2009 16:55:38 +0100, David Mitchell wrote:
KernalImage doesn't have a GUI.
Here's a bit more background; Eliot is this headless enough?
- http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-October/096111.h...
On Sat, Feb 28, 2009 at 8:19 AM, Klaus D. Witzel klaus.witzel@cobss.com wrote:
On Sat, 28 Feb 2009 16:55:38 +0100, David Mitchell wrote:
KernalImage doesn't have a GUI.
Here's a bit more background; Eliot is this headless enough?
Yes, this looks good. I would still prefer to go that little bit further and construct the core image from first principles, e.g. using John Maloney's MicroSqueak, as I described earlier in this thread. But Pavel's headless core looks to me to be functionally the right starting point. In any case it can be used to derive the other images while the first-principles bootstrap is being built (if it doesn't exist already).
Why go that "little bit further" and create the image from first principles? Repeatability. Then the freedom to choose new object representations and bytecode sets, and hence bootstrapping new languages like Newspeak, is much easier. Hydra might benefit from pre-packaged minimal starting points that can easily be tailored.
On Sat, 28 Feb 2009 17:56:54 +0100, Eliot Miranda wrote:
On Sat, Feb 28, 2009 at 8:19 AM, Klaus D. Witzel wrote:
On Sat, 28 Feb 2009 16:55:38 +0100, David Mitchell wrote:
KernalImage doesn't have a GUI.
Here's a bit more background; Eliot is this headless enough?
Yes, this looks good. I would still prefer to go that little bit further and construct the core image from first principles, e.g. using John Maloney's MicroSqueak, as I described earlier in this thread. But Pavel's headless core looks to me to be functionally the right starting point. In any case it can be used to derive the other images while the first-principles bootstrap is being built (if it doesn't exist already).
Why go that "little bit further" and create the image from first principles? Repeatability.
Agreed, repeatability, but the language of first principles is set to be Smalltalk and their [principles] "imagination" is objects, so this sounds a bit abstract, no?
Then the freedom to choose new object representations and bytecode sets, and hence bootstrapping new languages like Newspeak, is much easier.
Right, this was the reason Moebius was born (formerly: CorruptVM): to make any pair of old and new representations interoperable, and the same for pairs of old and new instruction sets.
Hydra might benefit from pre-packaged minimal starting-points that can easily be tailored.
:)
--- On Sat, 28 Feb 2009, Eliot Miranda eliot.miranda@gmail.com wrote:
KernalImage doesn't have a GUI.
Here's a bit more background; Eliot is this headless enough?
Yes, this looks good. I would still prefer to go that little bit further and construct the core image from first principles, e.g. using John Maloney's MicroSqueak, as I described earlier in this thread. But Pavel's headless core looks to me to be functionally the right starting point. In any case it can be used to derive the other images while the first-principles bootstrap is being built (if it doesn't exist already).
This "first principles" approach you describe could be the Fenix image and technique of Alejandro Reimondo (Chachara files). I took his technique and image and did a partial port to SqueakLightII, and had success in building some of his examples for big-endian (PPC); the original was for Windows (Intel) only.
On 01.03.2009, at 09:59, stephane ducasse wrote:
Where is the code of MicroSqueak?
On John's hard disk. He did it a couple of years ago and showed it to us last summer. As Eliot wrote, it builds an image in memory from a hierarchy of classes. Pretty neat, though probably more a proof of concept than production-ready.
- Bert -
Hi Bert (and Eliot)--
[John Maloney] did [MicroSqueak] a couple of years ago and showed it to us last summer. As Eliot wrote, it builds an image in memory from a hierarchy of classes.
How did he know which classes were necessary for the new image?
thanks,
-C
-- Craig Latta www.netjam.org next show: 2009-03-13 (www.thishere.org)
On Sun, Mar 1, 2009 at 4:55 PM, Craig Latta craig@netjam.org wrote:
Hi Bert (and Eliot)--
[John Maloney] did [MicroSqueak] a couple of years ago and showed it to us last summer. As Eliot wrote, it builds an image in memory from a hierarchy of classes.
How did he know which classes were necessary for the new image?
There is a hierarchy rooted at MObject, all of which becomes the new classes in the new image. The generator renames the classes so that in the generated image MObject is called Object and so on for all subclasses.
HTH E.
Hi Eliot--
How did [John Maloney] know which classes were necessary for [a] new [MicroSqueak] image?
There is a hierarchy rooted at MObject, all of which becomes the new classes in the new image. The generator renames the classes so that in the generated image MObject is called Object and so on for all subclasses.
Right, but how did he know which classes should be in that hierarchy? I.e., how did he decide what was essential and what wasn't?
-C
On Sun, Mar 1, 2009 at 5:51 PM, Craig Latta craig@netjam.org wrote:
Hi Eliot--
How did [John Maloney] know which classes were necessary for [a] new [MicroSqueak] image?
There is a hierarchy rooted at MObject, all of which becomes the new classes in the new image. The generator renames the classes so that in the generated image MObject is called Object and so on for all subclasses.
Right, but how did he know which classes should be in that hierarchy?
I.e., how did he decide what was essential and what wasn't?
You should ask John, but the aim was to get a minimal Hello World, so there isn't a compiler, for example. But surely the answer is "whatever you need to get what you want done", right? So if the goal is a microkernel from which one can bootstrap an image, one needs a compiler, a file-system interface, and, for sanity, a minimal error-reporting framework, right?
You should ask John, but the aim was to get a minimal Hello World so there isn't a compiler for example. But surely the answer is "whatever you need to get what you want done", right?
Heh, sure... but you're not answering my question. :) I'm wondering how others have gone about deriving the complete set of classes and methods a system needs to perform a particular task (so that I can compare to how I do it). Indeed, I should ask John.
So if the goal is a microkernel from which one can bootstrap an image one needs a compiler, a file system interface, and for sanity a minimal error-reporting framework right?
Sure, but that doesn't help much with questions like "Should I include method X or not?". The necessity of several things in the system is rather subtle. :) I like to be able to point to any byte in an object memory and give a simple explanation as to why it's there, and have performed a straightforward process to get that explanation.
-C
2009/3/2 Craig Latta craig@netjam.org:
You should ask John, but the aim was to get a minimal Hello World so there isn't a compiler for example. But surely the answer is "whatever you need to get what you want done", right?
Heh, sure... but you're not answering my question. :) I'm wondering how others have gone about deriving the complete set of classes and methods a system needs to perform a particular task (so that I can compare to how I do it). Indeed, I should ask John.
So if the goal is a microkernel from which one can bootstrap an image one needs a compiler, a file system interface, and for sanity a minimal error-reporting framework right?
Sure, but that doesn't help much with questions like "Should I include method X or not?". The necessity of several things in the system is rather subtle. :) I like to be able to point to any byte in an object memory and give a simple explanation as to why it's there, and have performed a straightforward process to get that explanation.
I think I can give you an idea: any object or class that interfaces with core VM functionality should be in that hierarchy. A starting point: the VM requires a special objects array, properly filled with certain objects having certain expected properties/slots. There is no way to avoid providing this information in the image without risking breaking everything. Going further, we could identify all methods that use the core set of primitives (mainly the numeric ones) and add them as well. Next, add basic I/O (make the file plugin work). And finally, make the compiler work (in non-interactive mode). The rest is optional, since with a compiler we can file in any code we want.
Hi--
Igor writes:
...any object or class that interfaces with core VM functionality should be in that hierarchy.
Heh, but that just shifts the question over to what exactly "core VM functionality" is. There's a lot of cruft in the VM, too. And I happen to agree about starting from a desired high-level task (that's exactly what I did for Spoon). I'm suspicious of any process that doesn't start from there.
The rest is optional, since by having a compiler we could file in any code we want to.
Well, I prefer to install compiled methods directly without recompiling anything, but sure.
Stephen Pair writes:
Hmm, I think the key question here is: what do you want to be able to do with the image you create?
Sure, I personally think that should be where the process starts (otherwise I suspect unnecessary things get included), but I'm interested in approaches from that point that differ from mine.
I've no idea how John did this, but I think I would start by writing a method that did all the things I'd want to be able to do in the image I ultimately create. This might be things like compiling a string of code, loading something from a file (or socket), etc... the bare minimum of things you'd absolutely need to do in a minimal image. Then I'd guess you could create a shadow M* class with this method in it and some sort of proxies for literals and global references. With a little doesNotUnderstand: magic I'd think you could run that method, hit those proxies and copy over other classes and methods as you hit them. I'm sure the devil is in the details, but this top-level method that expresses exactly the things that you'd want to be able to do in this minimal image seems like it would be an interesting artifact in and of itself. It would be in a sense a specification of this image.
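Stephen's proxy trick has a rough analogue in other dynamic languages. Here is a hypothetical Python sketch in which `__getattr__` plays the role of doesNotUnderstand:, faulting members in from the "host image" as the spec method touches them; all names are invented for illustration.

```python
# Hypothetical sketch: copy-on-first-touch discovery of what a spec method needs.

class FaultingEnv:
    def __init__(self, host):
        self._host = host      # the full host image, modelled as a dict
        self._copied = {}      # what the minimal image actually needed

    def __getattr__(self, name):
        # Called only when normal lookup fails: the analogue of a DNU fault.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._copied:
            self._copied[name] = self._host[name]   # copy on first touch
        return self._copied[name]

host = {"double": lambda x: x * 2, "unused": lambda x: x}
env = FaultingEnv(host)

# The "spec method": exercise only what the minimal image must support.
result = env.double(21)
```

After the run, `env._copied` holds exactly the members the spec exercised; everything untouched stays behind in the host.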
This is roughly what I did with Spoon, although my tactic was to mark everything in a normal object memory involved in a particular task, then use the garbage collector to throw away everything else atomically[1]. I like to have a known-working object memory at every point in the process, by dealing with a running memory as much as possible (rather than creating one in situ and hoping that it works when resumed).
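The mark-then-collect tactic can be modelled abstractly: mark everything reachable from what the task touches, then let the collector drop the rest in one step. This Python sketch uses an invented toy object graph; it is a model of the idea, not of Spoon's actual mechanism.

```python
# Toy model of "mark what the task uses, collect the rest atomically".

graph = {                      # object -> objects it references
    "Task":   ["Stream", "Dict"],
    "Stream": ["Buffer"],
    "Buffer": [],
    "Dict":   [],
    "GUI":    ["Font"],        # never marked: collected
    "Font":   [],
}

def mark(roots):
    live, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(graph[obj])
    return live

def collect(live):
    """The 'garbage collector': drop every unmarked object in one pass."""
    return {k: v for k, v in graph.items() if k in live}

shrunk = collect(mark(["Task"]))
```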
Igor responds:
I have a similar idea: capture methods/classes while running code to discover what objects I need to clone into a separate heap to make it run under another interpreter instance in Hydra.
Squeak already has facilities which can be used for this (MessageTally>>tallySends:). We just need to make some modifications to it and, as you pointed out, since we need a bare minimum, capture things while running:
FileStream fileIn: 'somecode.st'
is a good starting point.
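Capturing "what ran" during such a fileIn is, in spirit, just installing a tracing hook for the duration of the task. A hypothetical Python sketch of the idea, with `sys.settrace` standing in for MessageTally-style send capture:

```python
# Sketch: record every function entered while a representative task runs.
import sys

executed = set()

def tracer(frame, event, arg):
    if event == "call":
        executed.add(frame.f_code.co_name)
    return tracer

def helper():
    return 1

def file_in():            # stand-in for running the fileIn task
    return helper() + 1

sys.settrace(tracer)      # hook on...
file_in()
sys.settrace(None)        # ...hook off; `executed` now names the needed code
```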
Right, although I think using the VM to do the marking is more convenient, and faster.
Andreas writes:
The alternative (which I used a couple of years back) is to say: Everything in Kernel-* should make for a self-contained kernel image.
Aha, yeah; I'm not that trusting. :)
So I started by writing a script which would copy all the classes and while doing so rename all references to classes (regardless of whether defined in kernel or not).
At the end of the copying process you end up with a huge number of Undeclared variables. This is your starting point. Go in and add, remove or rewrite classes and methods so that they do not refer to entities outside of your environment. This requires some judgment calls; for example, I had a category Kernel-Graphics which included Color, Point, and Rectangle. Then I did another pass removing lots of methods which I had determined to be unused.
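The "chase the Undeclared variables" step amounts to set subtraction: every name the copied kernel references, minus every name it defines. A toy Python sketch with invented names:

```python
# Sketch of finding Undeclared references in a candidate kernel.

kernel_defs = {                         # class -> names its code references
    "Point":     ["Object", "Number"],
    "Rectangle": ["Object", "Point"],
    "Object":    [],
    "Number":    ["Object"],
    "Color":     ["Object", "Form"],    # Form lives outside the kernel...
}

def undeclared(defs):
    defined = set(defs)
    referenced = {ref for refs in defs.values() for ref in refs}
    return referenced - defined

# Each resulting entry needs a judgment call: pull in Form, or rewrite Color?
missing = undeclared(kernel_defs)
```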
Yeah, that's a lot of work; perhaps on the order of work I was doing earlier in the project, when I was removing things manually with remote tools[2].
At the end of the process I wrote a script that (via some arcane means) did a self-transformation of the image I was running and magically dropped from 30MB to 400k in size. Then I had a hard disk crash and most of the means that I've been using in this work were lost :-(((
Ouch! I'm sorry to hear that. That actually happened to me too (in 2005), but through a total coincidence I had a sufficiently-recent backup to keep going. Several nice minutes of panic...
I still have the resulting image but there is really no realistic way of recovering the process. Which is why I would argue that the better way to go is to write an image compiler that takes packages and compiles them into a new object memory. That way you are concentrating on the process rather than on the artifact (in my experience all the shrinking processes end up as nonrepeatable one-offs).
Oh, I agree that shrinking is not something one should do to produce deployment artifacts. I think it should be done to get a truly minimal memory that can load modules, and then never done again (although the way I do it is repeatable, for the sake of review).
As for whether to produce an object memory statically and then set it running, or transform an object memory which is always running... I think the resulting memory will need to load modules live anyway, so one might as well do all the transformations that way. Perhaps this is simply an aesthetic choice.
thanks,
-C
[1] http://tinyurl.com/2gbext (lists.squeakfoundation.org) [2] http://tinyurl.com/bdtdlb (lists.squeakfoundation.org)
-- Craig Latta www.netjam.org next show: 2009-03-13 (www.thishere.org)
On Sun, Mar 1, 2009 at 9:20 PM, Craig Latta craig@netjam.org wrote:
[snip] As for whether to produce an object memory statically and then set it running, or transform an object memory which is always running... I think the resulting memory will need to load modules live anyway, so one might as well do all the transformations that way. Perhaps this is simply an aesthetic choice.
Surely repeatability mandates that one produce an object memory statically and then set it running? Because of things like delays, the always-running memory is almost never in a predictable state, so one always ends up with different bits even if they represent the same functionality.
E.
On Mon, Mar 2, 2009 at 12:38 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
[snip]
Surely repeatability mandates that one produce an object memory statically and then set it running?
Maybe you could get the repeatability with a process that is roughly:
a) write the spec for the capability of the image (a method that exercises everything you want to be able to do)
b) use the class/method copying & DNU trickery and do the runtime analysis to figure out the classes and methods needed to support that capability
c) do something a little more surgical to build a new image by copying over the behaviors and methods, but construct the processes and stacks more deliberately (so you aren't so tied to the running image's state)
I'd think in this way you could do something that was reproducible to the extent that the resulting image was only dependent on the running image for its behaviors and other necessary objects (various singletons and whatnot), but otherwise not affected by various processes and random other things that might be in that image. Once you had (b) and (c) mostly ironed out, it would be a process of refining the specification in (a) to get to a suitable minimal image.
- Stephen
On Mon, Mar 2, 2009 at 1:48 PM, Stephen Pair stephen@pairhome.net wrote:
[snip]
Agreed. The nice thing is being able to run a) in the IDE so that when something is missing it manifests as an Undeclared or an MNU.
One thing is ensuring that, when simulating, objects like nil, true and false behave as they will in the result, not as defined in the host image. One thing one could do is arrange that the compiler for MObject uses instances of MSymbol for all code under MObject. Then e.g. a doesNotUnderstand: handler on SmallInteger, UndefinedObject, Boolean et al might be able to forward things correctly and arrange that the simulation was more accurate.
2009/3/3 Eliot Miranda eliot.miranda@gmail.com:
[snip]
That's why I wrote my own parser/compiler in Moebius. It is designed so that the parser & compiler output is under the full control of an object which plays the role of the environment. So you can produce an instance of CompiledMethod as output, or encode the result in machine code, or represent methods as raw bytes which can then be put into the image you are constructing. Even the nil, true, false singleton values are under the control of the environment.
Read more about it here: http://code.google.com/p/moebius-st/wiki/Parser
Simulation of SmallInts could be made easy: we could simply make a class named BoxedSmallInteger and use it for representing all literal values in methods. At the final stage of image creation we can unbox them and replace them with SmallIntegers. We're in Smalltalk, after all, where such things are possible to do, unlike in many other languages :)
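The boxing idea is straightforward to model: carry literal integers in wrapper objects while the image is under construction, then run one final unboxing pass. A hypothetical Python sketch (only the class name BoxedSmallInteger comes from Igor's proposal; the rest is an invented stand-in):

```python
# Sketch: boxed literals during image construction, unboxed in a final pass.

class BoxedSmallInteger:
    def __init__(self, value):
        self.value = value

def unbox_all(literals):
    """Final stage of image creation: replace boxes with plain ints."""
    return [x.value if isinstance(x, BoxedSmallInteger) else x
            for x in literals]

method_literals = [BoxedSmallInteger(3), "a string", BoxedSmallInteger(4)]
unboxed = unbox_all(method_literals)
```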
On Mon, Mar 2, 2009 at 3:25 PM, Igor Stasenko siguctua@gmail.com wrote:
[snip]
Simulation of SmallInts could be made easy - we could simply make a class, named BoxedSmallInteger and use it for representing all literal values in methods. At final stage of image creating we can unbox them and replace by smallints.
Cool. So the compiler can avoid using the pushNil, pushFalse and pushTrue bytecodes. It must send some message to coerce every MBoolean result into a host Boolean before doing a conditional jump. It must wrap all MSmallInteger relational primitive invocations with code to coerce the Boolean to the matching MBoolean.
A doesNotUnderstand: will produce an instance of Message and send doesNotUnderstand:, so does MObject need a doesNotUnderstand: handler that sends MSymbol #doesNotUnderstand: with a coercion of the Message to an MMessage? Any other holes that need to be plugged?
Alternatively just create a subclass of InstructionStream and/or ContextPart and interpret all code and have the interpretation manage MObject. That might be slow but easier to get going.
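The coercion problem Eliot raises can be shown in miniature: a guest relational primitive answers a guest boolean, so the host must coerce it explicitly before taking any conditional branch. A toy Python sketch with invented names (GuestBool standing in for MBoolean):

```python
# Sketch: guest boolean objects must be coerced at host branch points.

class GuestBool:
    def __init__(self, flag):
        self.flag = flag

def guest_less_than(a, b):
    # A guest relational primitive answers a *guest* boolean...
    return GuestBool(a < b)

def host_branch(guest_bool, if_true, if_false):
    # ...so the branch point must coerce it to a host boolean explicitly.
    return if_true() if guest_bool.flag else if_false()

branch_taken = host_branch(guest_less_than(1, 2),
                           lambda: "then", lambda: "else")
```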
We're in smalltalk, after all, where such things are possible to do, unlike many other languages :)
Right on!
On Mon, Mar 2, 2009 at 3:51 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
[snip]
Alternatively just create a subclass of InstructionStream and/or ContextPart and interpret all code and have the interpretation manage MObject. That might be slow but easier to get going.
But perhaps a better alternative is just to use Hydra and provide a remote debugging interface to another Hydra space. So there's a version of the Hydra spawn operation that constructs the heap from MObject. That machinery would be easy to extend, right Igor?
How about the remote debugging? How minimal is the debugging stub that must exist in the spawned MImage? Would one need VM changes (e.g. a callback handler for a recursive doesNotUnderstand: error)?
2009/3/3 Eliot Miranda eliot.miranda@gmail.com:
[snip]
Cool. So the compiler can avoid using the pushNil, pushFalse and pushTrue bytecodes. It must send some message to coerce every MBoolean result into a host Boolean before doing a conditional jump. It must wrap all MSmallInteger relational primitive invocations with code to coerce the Boolean to the matching MBoolean.
right, something like:

MSmallInteger >> < object
	^ MBoolean from: boxed < object boxedValue
A doesNotUnderstand: will produce an instance of Message and send doesNotUnderstand:, so does MObject need a doesNotUnderstand: handler that sends MSymbol #doesNotUnderstand: with a coercion of the Message to an MMessage?
Why care about converting it, when you can simply replace the Message class with MMessage in the special objects array while running 'sandboxed' code :)
A second thing about #doesNotUnderstand:. Often people forget that some classes can have their own custom #doesNotUnderstand: method. I'd rather not put any expectations on DNU while designing a micro-image bootstrapper.
And I seem to have lost the direction this discussion has taken. Where/when would we want to run code in the host environment?
In Moebius, since it is hosted in Squeak, and since we have a better parser ;), I can simulate the method behavior at multiple stages, including just after parsing a method. I created a simple MockContext class, which can evaluate things strictly in the manner in which they were parsed from the method source. This, for instance, allows us to test parser output in a completely black-box fashion:

testEvaluateParserOutput
	| method |
	method := self parse: ' = x ^ x == self ' class: Object.
	self assert: [ (CVMockContext evaluate: method arguments: #(1 1)) == true ].
	self assert: [ (CVMockContext evaluate: method arguments: #(true false)) == false ]

testEvaluateParserOutput2
	| method |
	method := self parse: ' foo ^ #(nil true false) ' class: Object.
	self assert: [ (CVMockContext evaluate: method arguments: #(nil)) = #(nil true false) ].
	method := self parse: ' foo ^ #(#a #b #c) ' class: Object.
	self assert: [ (CVMockContext evaluate: method arguments: #(nil)) = #(#a #b #c) ].
	method := self parse: ' foo ^ #() ' class: Object.
	self assert: [ (CVMockContext evaluate: method arguments: #(nil)) = #() ]

Note that 'method' above is not a CompiledMethod; it is an AST form of the parsed source, encoded as lambda message sends.
Any other holes that need to be plugged? Alternatively just create a subclass of InstructionStream and/or ContextPart and interpret all code and have the interpretation manage MObject. That might be slow but easier to get going.
But perhaps a better alternative is just to use Hydra and provide a remote debugging interface to another Hydra space. So there's a version of the Hydra spawn operation that constructs the heap from MObject. That machinery would be easy to extend, right Igor?
Right, this is what I'm writing between the lines :). I am willing to have a generic toolset which could easily produce micro-images for any purpose, including a kernel image, of course. Lately we discussed one more little primitive for Hydra with Klaus, and one more kind of channel: an object channel. It will allow you to transfer objects (even cloning subgraphs) between images, not just a dumb raw run of bytes :). This could ease the development of tools which require interaction between images, because you don't have to care about serializing/deserializing stuff - you literally just send what you want to the other side.
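The observable behaviour of such an object channel, as described, is that the receiver gets a clone of the whole subgraph rather than shared state. A hypothetical Python model using a deep copy (Hydra's actual primitive would presumably work at the VM level; this only models the contract):

```python
# Sketch: an "object channel" that clones the sent subgraph.
import copy

class Channel:
    def __init__(self):
        self.queue = []

    def send(self, obj):
        self.queue.append(copy.deepcopy(obj))   # clone the whole subgraph

    def receive(self):
        return self.queue.pop(0)

chan = Channel()
original = {"config": {"depth": 3}}
chan.send(original)
original["config"]["depth"] = 99                # sender mutates after sending

received = chan.receive()                       # receiver saw the clone
```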
How about the remote debugging? How minimal is the debugging stub that must exist in the spawned MImage? Would one need VM changes (e.g. a callback handler for a recursive doesNotUnderstand: error)?
I think that putting debugger support into the VM would be a big mistake. Debugging is a fairly complex domain, and I don't think we need to deal with it at the VM level, where there are no objects, just oops, headers & bits. That is the right way to get a hellishly complex & unmanageable artifact.
The debugger, like anything else, is invoked using a regular message send (during Error>>signal). So it is easy to hook into it and turn it in the right direction. I made a simple class HydraDebugToolSet, which replaces the image's default toolset for images running in the background. As a result, when an error happens, it sends an error message to the #transcript channel of the main interpreter. Nothing stops us from going a bit further and requesting that the main interpreter establish a remote debugging session (except that we don't have Debuggers with remote debugging capabilities ;) ). But I know there is already at least one remote debugger implementation in Squeak - the GemStone tools. They use the OB tools to generate the UI & other stuff. I'm not sure what license they have, and whether they could be taken as the base for a remote debugging tool for Squeak. (It would be nice to have a basic remote debugging framework in Squeak which could allow different backends - either G/S, a remote socket connection, or Hydra channels.)
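The HydraDebugToolSet idea reduces to replacing local error handling with forwarding over a channel. A minimal hypothetical Python model (names invented; not the actual Hydra API):

```python
# Sketch: a background "image" forwards errors over a channel instead of
# opening a local debugger.

transcript_channel = []      # stands in for the #transcript Hydra channel

def run_in_background(work):
    try:
        work()
    except Exception as err:
        # Forward the error to the main interpreter rather than debugging here.
        transcript_channel.append(f"{type(err).__name__}: {err}")

run_in_background(lambda: 1 / 0)
```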
-- Best regards, Igor Stasenko AKA sig.
2009/3/3 Igor Stasenko siguctua@gmail.com:
- Show quoted text -
2009/3/3 Eliot Miranda eliot.miranda@gmail.com:
On Mon, Mar 2, 2009 at 3:51 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Mon, Mar 2, 2009 at 3:25 PM, Igor Stasenko siguctua@gmail.com wrote:
2009/3/3 Eliot Miranda eliot.miranda@gmail.com:
- Show quoted text -
- Show quoted text -
On Mon, Mar 2, 2009 at 1:48 PM, Stephen Pair stephen@pairhome.net wrote:
On Mon, Mar 2, 2009 at 12:38 AM, Eliot Miranda eliot.miranda@gmail.com wrote: > > > On Sun, Mar 1, 2009 at 9:20 PM, Craig Latta craig@netjam.org wrote: >> >> [snip] >> As for whether to produce an object memory statically and then >> set >> it running, or transform an object memory which is always running... >> I think >> the resulting memory will need to load modules live anyway, so one >> might as >> well do all the transformations that way. Perhaps this is simply an >> aesthetic choice. > > Surely repeatability mandates that one roduce an object memory > statically > and then set it running? Because of things like delays the always > running > memory is almost never in a predictable state, so one always ends up > with > different bits even if they represent the same functionality. > E.
Maybe you could get the repeatability with a process that is roughly: a) write the spec for the capability of the image (a method that exercises everything you want to be able to do) b) use the class/method copying & DNU trickery and do the runtime analysis to figure out the classes and methods needed to support that capability c) do something a little more surgical to build a new image by copying over the behaviors and methods, but construct the processes and stacks more deliberately (so you aren't so tied to the running image's state) I'd think in this way you could do something that was reproducible to the extent that resulting image was only dependent on the running image for its behaviors and other necessary objects (various singletons and whatnot), but otherwise not affected by various processes and random other things that might be in that image. Once you had (b) and (c) mostly ironed out, it would be a process of refining the specification in (a) to get to a suitable minimal image.
Agreed. The nice thing is being able to run a) in the IDE so that when something is missing it manifests as an Undeclared or an MNU. One issue is ensuring that, when simulating, objects like nil, true and false behave as they will in the result, not as defined in the host image. One thing one could do is arrange that the compiler for MObject uses instances of MSymbol for all code under MObject. Then e.g. a doesNotUnderstand: handler on SmallInteger, UndefinedObject, Boolean et al might be able to forward things correctly and arrange that the simulation was more accurate.
That's why I wrote my own parser/compiler in Moebius. It is designed in such a way that the parser & compiler output is under the full control of an object which plays the role of the environment. So, you can produce an instance of CompiledMethod as output, or encode the result in machine code, or represent methods as raw bytes which could then be put into the image you are constructing. Even the nil, true, false singleton values are under the control of the environment.
Read more about it here: http://code.google.com/p/moebius-st/wiki/Parser
Simulation of SmallIntegers could be made easy: we could simply make a class named BoxedSmallInteger and use it for representing all literal values in methods. At the final stage of image creation we can unbox them and replace them with SmallIntegers.
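[Editor's sketch] The boxing idea above is small enough to spell out. The class name is from Igor's post; the selectors and the single instance variable are my assumptions, not anything from Moebius itself:

```smalltalk
Object subclass: #BoxedSmallInteger
	instanceVariableNames: 'boxed'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Moebius-Bootstrap'.

BoxedSmallInteger class >> on: aSmallInteger
	"Wrap a host SmallInteger literal for use during bootstrap."
	^self new setBoxed: aSmallInteger

BoxedSmallInteger >> setBoxed: aSmallInteger
	boxed := aSmallInteger

BoxedSmallInteger >> boxedValue
	"Answer the wrapped host value; at the final stage of image
	creation every box is replaced by this SmallInteger."
	^boxed
```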
Cool. So the compiler can avoid using the pushNil, pushFalse and pushTrue bytecodes. It must send some message to coerce every MBoolean result into a host Boolean before doing a conditional jump. It must wrap all MSmallInteger relational primitive invocations with code to coerce the Boolean to the matching MBoolean.
right, something like:

MSmallInteger >> < object
	^ MBoolean from: boxed < object boxedValue
A doesNotUnderstand: will produce an instance of Message and send doesNotUnderstand:, so MObject needs a doesNotUnderstand: handler that sends MSymbol #doesNotUnderstand: with a coercion of the Message to an MMessage.
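[Editor's sketch] Such a handler might look roughly like this; the MMessage construction selector and the mDoesNotUnderstand: re-dispatch are my assumptions for illustration:

```smalltalk
MObject >> doesNotUnderstand: aMessage
	"Coerce the host Message into the simulated MMessage and
	re-dispatch, so the simulated code only ever sees MMessage."
	| coerced |
	coerced := MMessage
		selector: aMessage selector
		arguments: aMessage arguments.
	^self mDoesNotUnderstand: coerced
```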
Why care about converting it, when you can simply replace the Message class with MMessage in the special objects array while running the 'sandboxed' code :)
A second thing about #doesNotUnderstand:. People often forget that some classes can have their own custom #doesNotUnderstand: method. I'd rather not put any expectations on DNU while designing a micro-image bootstrapper.
And I seem to have lost the direction this discussion has taken. Where/when would we want to run code in the host environment?
In Moebius, since it is hosted in Squeak, and since we have a better parser ;) I can simulate the method behavior at multiple stages, including just after parsing a method. I created a simple MockContext class, which can evaluate things strictly in the manner in which they are parsed from the method source. This, for instance, allows us to test parser output in a complete black-box fashion:
testEvaluateParserOutput
	| method |
	method := self parse: ' = x ^ x == self ' class: Object.
	self assert: [(CVMockContext evaluate: method arguments: #(1 1)) == true].
	self assert: [(CVMockContext evaluate: method arguments: #(true false)) == false]

testEvaluateParserOutput2
	| method |
	method := self parse: ' foo ^ #(nil true false) ' class: Object.
	self assert: [(CVMockContext evaluate: method arguments: #(nil)) = #(nil true false)].
	method := self parse: ' foo ^ #(#a #b #c) ' class: Object.
	self assert: [(CVMockContext evaluate: method arguments: #(nil)) = #(#a #b #c)].
	method := self parse: ' foo ^ #() ' class: Object.
	self assert: [(CVMockContext evaluate: method arguments: #(nil)) = #()]
Note that 'method' above is not a CompiledMethod; it is an AST form of the parsed source, encoded as lambda message sends.
Any other holes that need to be plugged? Alternatively just create a subclass of InstructionStream and/or ContextPart and interpret all code and have the interpretation manage MObject. That might be slow but easier to get going.
But perhaps a better alternative is just to use Hydra and provide a remote debugging interface to another Hydra space. So there's a version of the Hydra spawn operation that constructs the heap from MObject. That machinery would be easy to extend, right Igor?
Right, this is what I'm writing between the lines :) I want to have a generic toolset which could easily produce micro-images for any purpose, including a kernel image, of course. Lately, Klaus and I discussed one more little primitive for Hydra, and one more kind of channel: an object channel. It will allow you to transfer objects (even cloning subgraphs) between images, not just a dumb raw run of bytes :) This could ease developing tools which require interaction between images, because you don't have to care about serializing/deserializing stuff; you literally just send what you want to the other side.
A little more about that. The VM is capable of recognizing the most basic object types (ints, strings, bytes, etc.), which allows us to transfer objects in a JSON-like style. Suppose you want to transfer an instance of a class Foo, which can't be recognized by the VM. You can serialize it as an array, like: #(#Foo ivar1value ivar2value ...)

Then, at the receiving side, it is quite simple to reconstruct it into an instance of the Foo class. Compare this with the amount of processing you may need if you are limited to exchanging raw byte buffers.
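[Editor's sketch] The receiving-side reconstruction Igor describes is a few lines; the selector name and the instVarAt:put: strategy here are my assumptions, but the array layout is the one from his post:

```smalltalk
reconstructFrom: anArray
	"anArray has the shape #(#Foo ivar1value ivar2value ...):
	the class name followed by the instance variable values."
	| class instance |
	class := Smalltalk at: (anArray at: 1).
	instance := class basicNew.
	2 to: anArray size do: [:i |
		instance instVarAt: i - 1 put: (anArray at: i)].
	^instance
```

This naive version only handles classes whose instance variables are themselves VM-recognizable; nested unrecognized objects would need the same treatment recursively.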
How about the remote debugging? How minimal is the debugging stub that must exist in the spawned MImage? Would one need VM changes (e.g. a callback handler for a recursive doesNotUnderstand: error)?
I think that putting debugger support into the VM would be a big mistake. Debugging is a fairly complex domain, and I don't think we need to deal with it at the VM level, where there are no objects, only oops, headers & bits. That is the right way to get a hellishly complex & unmanageable artifact.
The debugger, like anything else, is invoked using a regular message send (during Error>>signal). So it is easy to hook into it and turn it in the right direction. I made a simple class, HydraDebugToolSet, which replaces an image's default toolset for images running in the background. As a result, when an error happens, it sends an error message to the #transcript channel of the main interpreter. Nothing stops us from going a bit further and requesting the main interpreter to establish a remote debugging session (except that we don't have debuggers with remote debugging capabilities ;) ). But I know there is already at least one remote debugger implementation in Squeak: the GemStone tools. They use the OB tools to generate the UI & other stuff. I'm not sure what license they carry, and whether they could be taken as a base for a remote debugging tool for Squeak. (It would be nice to have a basic remote debugging framework in Squeak that allows different backends: GemStone, a remote socket connection, or Hydra channels.)
We're in Smalltalk, after all, where such things are possible, unlike in many other languages :)
Right on!
- Stephen
-- Best regards, Igor Stasenko AKA sig.
On Mon, Mar 2, 2009 at 6:59 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
But perhaps a better alternative is just to use Hydra and provide a remote debugging interface to another Hydra space. So there's a version of the Hydra spawn operation that constructs the heap from MObject. That machinery would be easy to extend, right Igor?
I like it. It made me immediately think of gestation and child birth. You could call this early stage interface the umbilical interface. But seriously, if you really want to get at the very smallest possible starting image, constructing one that is a sort of embryo that is still dependent on its host and unable to live in the world on its own is probably the way to go. This minimal image wouldn't need a file system interface, a compiler, and probably lots of other things that one built to live on its own would need.
- Stephen
2009/3/3 Stephen Pair stephen@pairhome.net:
Right, it is waiting to be implemented. Currently, in the example of HydraClone>>cloneIdleProcess, I stub out all class/metaclass references with dumb anonymous instances of Class, which have a format field set and an empty method dictionary. This is to make sure the VM will not crash while stepping on a stubbed class :) To get the host<->embryo relation, we need to invent a special stub which carries enough information for passing it to the host image and getting back an object which then #becomes the real class, method, or whatever.
P.S. There is a lot of synergy with Spoon. People point this out from time to time. I just want to make it clear: I'm aware of it and even think it is worth integrating Spoon features with Hydra to avoid reinventing the wheel, especially on the language side.
On Tue, Mar 3, 2009 at 12:51 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Cool. So the compiler can avoid using the pushNil, pushFalse and pushTrue bytecodes. It must send some message to coerce every MBoolean result into a host Boolean before doing a conditional jump.
Is it possible to have the compiler not generate conditional jumps, but rather actually evaluate True>>ifTrue:, False>>ifTrue:, etc. for your own True and False classes (MTrue and MFalse)?
Is there something I'm forgetting which makes this obviously not work? Are conditional jumps really required?
Gulik.
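[Editor's sketch] For what it's worth, the message-send conditionals Gulik asks about would be tiny; the class names are from his post, everything else is assumed:

```smalltalk
MTrue >> ifTrue: trueBlock ifFalse: falseBlock
	"An ordinary message send replaces the inlined jump bytecodes."
	^trueBlock value

MFalse >> ifTrue: trueBlock ifFalse: falseBlock
	^falseBlock value

MTrue >> and: aBlock
	^aBlock value

MFalse >> and: aBlock
	^self
```

The cost is that every conditional becomes a full send plus a block activation, which is exactly what the inlined jump bytecodes exist to avoid; for a bootstrapper that cost may well be acceptable.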
Hi--
Ah, another day, another omnibus response. :)
Eliot writes:
Surely repeatability mandates that one produce an object memory statically and then set it running? Because of things like delays the always running memory is almost never in a predictable state, so one always ends up with different bits even if they represent the same functionality.
Sure, but that's all I care about: something minimal for a given set of functionality (in my case, loading the next module). I've made an initial memory from which others can be made by loading modules. If I need to make a new initial memory for some reason, I don't care whether it has every bit in the same place, but I do expect it to have the equivalent objects fulfilling equivalent roles.
I'm after repeatability of functionality without excess. So, no, repeatability doesn't mandate producing an object memory statically from within a host. Simulation was the first thing I tried. In my experience, it has been much easier (although still not a walk in the park :) to avoid simulation for this purpose and use remote messaging on a real object memory instead. I do, however, run the minimal memory in the interpreter simulator occasionally, to debug and produce visualizations[1].
...perhaps a better alternative is just to use Hydra and provide a remote debugging interface to another Hydra space.
...
How about the remote debugging? How minimal is the debugging stub that must exist in the spawned MImage? Would one need VM changes...?
Just a reminder at this point that Spoon has remote debugging between object memories, with support for context stacks spanning multiple physical machines. It uses remote messaging, which uses a small change to the VM's method lookup, but otherwise no special VM support is necessary. And no need for a myriad of shadow classes (MBoolean, MockContexts, specialized parsers, et al).
Igor writes:
...there is a lot of synergy with Spoon. People point this out from time to time.
Yes, this would be one of those times. :) I only wish that I could have finished this while I was on break, or that I could be employed to do this. :)
Janko writes:
A question from someone not so [knowledgeable about] Smalltalk internals: is Spoon compatible with the proposed MicroSqueak? That is, can it be Spoon based on top of MicroSqueak?
I haven't seen any indication that MicroSqueak is actually minimal, so I'm not sure why one would want to do that. It seems like Spoon and MicroSqueak are two fundamentally different approaches. Spoon isn't really something that runs "on top of" something else; once you add anything more to it, it's not minimal anymore.
Now, if instead you meant Naiad (Spoon's module system) and Other (Spoon's remote messaging framework)... sure, those will run in any Smalltalk. They have to, in order to provide a convenient way of moving code between old and new systems.
Stephen Pair writes:
Maybe you could get the repeatability with a process that is roughly...
Hmm, didn't you just write that, and didn't I respond that I'd already done an equivalent thing? :)
Jecel writes:
You are probably aware of the type inference work Ole Agesen did in Self?
Yes, thanks!
Igor writes:
I have a similar idea: capture methods/classes while running code to discover what objects I need to clone into a separate heap to make them run under another interpreter instance in Hydra.
Göran responds:
Mmm, you are aware of this stuff that Craig has in Spoon right? The "imprinting" stuff IIRC.
(Thanks, Göran!)
Igor responds:
Sure, I'm aware of that. Craig uses a changed VM to mark methods while code runs. But one could do much the same using already available tools. Yes, it will be slower, but this process (shrinking) doesn't have to be performed regularly, so why care.
True, imprinting uses method activation marking, but imprinting has nothing to do with shrinking. Imprinting is useful for transferring methods from one object memory to another as they are run, in real-time. If the marking weren't done in the VM, it wouldn't work. Given that, it really was easiest to just do shrinking with a slightly-modified version of the garbage collector. I think it makes a lot of sense.
But speaking of shrinking, I'll just reiterate the most extreme result so far, a 1337-byte object memory[2], suitable for t-shirts...
As an aside... I'm still amazed at how apprehensive people are about modifying the VM, after all the time Squeak's been around. The relative ease with which one can do it, with the ability to debug using Smalltalk tools, is one of the main compelling things about the system. It lets us just go ahead and change the VM *if that's what's appropriate*, rather than try to work around it as with previous systems. (True, it can be much easier still... and we'll get there faster if more people dive in.)
***
Finally, an obligatory repetition of what I'm working on now: I'm implementing Naiad, Spoon's module system[3]. I have a headful object memory with editions in it describing its classes, methods, modules, etc. (see [3] for terminology). I have a minimal object memory, and I have another headful memory with tools in it for manipulating memories remotely (remote system browser, remote inspectors, remote debugger, etc.).
Both the minimal memory and the tools memory can connect to the history memory and use that instead of a changes file. I can do traditional things like looking up versions of a particular method, but also more sophisticated queries (for example, methods written by a particular author over some time period, removed over some other period, and that access a particular instance variable).
My current task is creating a minimal history memory, to go along with the minimal memory. I'm transferring editions for all the components of the minimal memory into a copy of the minimal memory, and fixing bugs that I uncover in the process. Then I'll have the pieces of the next Spoon system: a minimal object memory, and a minimal history memory that describes it. I will release that, along with changesets for the remote tools that previous (3.2 to 4.0) object memories can use. Then people can start composing Naiad modules for all the behavior that I removed (e.g., graphics support).
I've made several releases of Spoon in the past (most notably 2004-02-14, 2005-12-11, 2006-10-25, and 2007-04-12), but every time the lack of the module system limited the pool of truly interested folks to a handful. Given how difficult it is to make releases before the module system exists, I've decided to focus on the module system (asking for feedback about the module system design[3] in the meantime). I still think it's going to be worth the time, and making it my day job would still speed it up.
-C
[1] http://netjam.org/spoon/viz [2] http://netjam.org/spoon/smallest [3] http://netjam.org/spoon/naiad
Hi Craig,
A question from someone not so knowledgeable about Smalltalk internals: is Spoon compatible with the proposed MicroSqueak?
That is, can Spoon be based on top of MicroSqueak? Even more, what preconditions does MicroSqueak need in order to run Spoon on top?
Best regards Janko
Craig Latta wrote:
Hi--
Igor writes:
...any object/class which interfaces with core VM functionality should be in that hierarchy.
Heh, but that just shifts the question over to what exactly "core VM functionality" is. There's a lot of cruft in the VM, too. And I happen to agree about starting from a desired high-level task (that's exactly what I did for Spoon). I'm suspicious of any process which doesn't start from there.
The rest is optional, since by having a compiler we could file in any code we want to.
Well, I prefer to install compiled methods directly without recompiling anything, but sure.
Stephen Pair writes:
Hmm, I think the key question here is: what do you want to be able to do with the image you create?
Sure, I personally think that should be where the process starts (otherwise I suspect unnecessary things get included), but I'm interested in approaches from that point that differ from mine.
I've no idea how John did this, but I think I would start by writing a method that did all the things I'd want to be able to do in the image I ultimately create. This might be things like compiling a string of code, loading something from a file (or socket), etc...the bare minimum of things you'd absolutely need to do in a minimal image. Then I'd guess you could create a shadow M* class with this method in it and some sort of proxies for literals and global references. With a little doesNotUnderstand: magic I'd think you could run that method, hit those proxies and copy over other classes and methods as you hit them. I'm sure the devil is in the details, but this top level method that expresses exactly the things that you'd want to be able to do in this minimal image seems like it would be an interesting artifact in and of itself. It would be in a sense a specification of this image.
This is roughly what I did with Spoon, although my tactic was to mark everything in a normal object memory involved in a particular task, then use the garbage collector to throw away everything else atomically[1]. I like to have a known-working object memory at every point in the process, by dealing with a running memory as much as possible (rather than creating one in situ and hoping that it works when resumed).
Igor responds:
I have a similar idea: capture methods/classes while running code to discover what objects I need to clone into a separate heap to make them run under another interpreter instance in Hydra.

Squeak already has facilities which can be used for this (MessageTally>>tallySends:). We just need to make some modifications to it, and, as you pointed out, since we need a bare minimum, then capture things while running:
FileStream fileIn: 'somecode.st'
is a good starting point.
Right, although I think using the VM to do the marking is more convenient, and faster.
Andreas writes:
The alternative (which I used a couple of years back) is to say: Everything in Kernel-* should make for a self-contained kernel image.
Aha, yeah; I'm not that trusting. :)
So I started by writing a script which would copy all the classes and while doing so rename all references to classes (regardless of whether defined in kernel or not).
At the end of the copying process you end up with a huge number of Undeclared variables. This is your starting point. Go in and add, remove or rewrite classes and methods so that they do not refer to entities outside of your environment. This requires applying some judgment calls, for example I had a category Kernel-Graphics which included Color, Point, and Rectangle. Then I did another pass removing lots of unused methods which I had determined to be unused.
Yeah, that's a lot of work; perhaps on the order of the work I was doing earlier in the project, when I was removing things manually with remote tools[2].
At the end of the process I wrote a script that (via some arcane means) did a self-transformation of the image I was running and magically dropped from 30MB to 400k in size. Then I had a hard disk crash and most of the means that I've been using in this work were lost :-(((
Ouch! I'm sorry to hear that. That actually happened to me too (in 2005), but through a total coincidence I had a sufficiently-recent backup to keep going. Several nice minutes of panic...
I still have the resulting image but there is really no realistic way of recovering the process. Which is why I would argue that the better way to go is to write an image compiler that takes packages and compiles them into a new object memory. That way you are concentrating on the process rather than on the artifact (in my experience all the shrinking processes end up with nonrepeatable one-offs)
Oh, I agree that shrinking is not something one should do to produce deployment artifacts. I think it should be done to get a truly minimal memory that can load modules, and then never done again (although the way I do it is repeatable, for the sake of review).
As for whether to produce an object memory statically and then set it running, or transform an object memory which is always running... I think the resulting memory will need to load modules live anyway, so one might as well do all the transformations that way. Perhaps this is simply an aesthetic choice.
thanks,
-C
[1] http://tinyurl.com/2gbext (lists.squeakfoundation.org) [2] http://tinyurl.com/bdtdlb (lists.squeakfoundation.org)
-- Craig Latta www.netjam.org next show: 2009-03-13 (www.thishere.org)
Craig Latta wrote on Sun, 01 Mar 2009 21:20:31 -0800
Sure, I personally think that should be where the process starts (otherwise I suspect unnecessary things get included), but I'm interested in approaches from that point that differ from mine.
You are probably aware of the type inference work Ole Agesen did in Self?
http://selflanguage.org/documentation/published/gold.html
Some of this has since been done for Squeak as well, but not (as far as I know) for generating minimal images. Of course, type inference has trouble with things like #perform. In the case of Self, primitive failures did silly string manipulation and that caused a lot of unrelated code (string stuff) to get pulled into every image. Another source of "leaks" was that even if you only used integer math in your application, the type inferencer couldn't prove that floating point would never be needed due to the way a few Integer methods were written.
In the end, it seems likely to me that the best result will be obtained by a combination of methods.
-- Jecel
On Sun, Mar 1, 2009 at 10:04 PM, Craig Latta craig@netjam.org wrote:
You should ask John, but the aim was to get a minimal Hello World so there isn't a compiler for example. But surely the answer is "whatever you need to get what you want done", right?
Heh, sure... but you're not answering my question. :) I'm wondering how others have gone about deriving the complete set of classes and methods a system needs to perform a particular task (so that I can compare to how I do it). Indeed, I should ask John.
Hmm, I think the key question here is: what do you want to be able to do with the image you create? I've no idea how John did this, but I think I would start by writing a method that did all the things I'd want to be able to do in the image I ultimately create. This might be things like compiling a string of code, loading something from a file (or socket), etc...the bare minimum of things you'd absolutely need to do in a minimal image. Then I'd guess you could create a shadow M* class with this method in it and some sort of proxies for literals and global references. With a little doesNotUnderstand: magic I'd think you could run that method, hit those proxies and copy over other classes and methods as you hit them. I'm sure the devil is in the details, but this top level method that expresses exactly the things that you'd want to be able to do in this minimal image seems like it would be an interesting artifact in and of itself. It would be in a sense a specification of this image.
- Stephen
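[Editor's sketch] The proxy-and-DNU capture Stephen describes could be sketched as below; every name here is hypothetical, nothing like this appears verbatim in the thread:

```smalltalk
ProtoObject subclass: #CaptureProxy
	instanceVariableNames: 'target captured'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'ImageSpec-Capture'.

CaptureProxy >> doesNotUnderstand: aMessage
	"On each send, record the class and the method that the spec
	method turned out to need, then forward the send to the real
	object so execution continues."
	| method |
	method := target class lookupSelector: aMessage selector.
	method ifNotNil: [captured add: (target class -> method)].
	^aMessage sendTo: target
```

Running the top-level spec method against such proxies would accumulate, in `captured`, exactly the class/method pairs to copy into the minimal image.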
2009/3/2 Stephen Pair stephen@pairhome.net:
I have a similar idea: capture methods/classes while running code to discover what objects I need to clone into a separate heap to make them run under another interpreter instance in Hydra.
Squeak already has facilities which can be used for this: MessageTally tallySends: aBlock.

We just need to make some modifications to it, and, as you pointed out, since we need a bare minimum, then capturing things while running:
FileStream fileIn: 'somecode.st'
is a good starting point.
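[Editor's sketch] MessageTally tallySends: is a real Squeak entry point, but the walk over the tally tree below uses assumed node accessors (#method, #sonsDo:), so treat it only as an illustration of the modification Igor hints at:

```smalltalk
| tally needed visit |
"Run the workload under the profiler; the result is the root of a
tally tree describing the sends that were performed."
tally := MessageTally tallySends: [FileStream fileIn: 'somecode.st'].

"Collect every class whose methods were activated; these are the
candidates for the minimal heap. The node protocol used here is an
assumption, not verified MessageTally API."
needed := Set new.
visit := nil.
visit := [:node |
	node method ifNotNil: [:m | needed add: m methodClass].
	node sonsDo: [:son | visit value: son]].
visit value: tally
```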
Igor Stasenko wrote:
I have a similar idea: capture methods/classes while running code to discover what objects I need to clone into a separate heap to make them run under another interpreter instance in Hydra.

Mmm, you are aware of this stuff that Craig has in Spoon, right? The "imprinting" stuff, IIRC.
regards, Göran
2009/3/2 Göran Krampe goran@krampe.se:
Sure, I'm aware of that. Craig uses a changed VM to mark methods while code runs. But one could do much the same using already available tools. Yes, it will be slower, but this process (shrinking) doesn't have to be performed regularly, so why care.
Stephen Pair wrote:
Hmm, I think the key question here is: what do you want to be able to do with the image you create? I've no idea how John did this, but I think I would start by writing a method that did all the things I'd want to be able to do in the image I ultimately create. This might be things like compiling a string of code, loading something from a file (or socket), etc...the bare minimum of things you'd absolutely need to do in a minimal image. Then I'd guess you could create a shadow M* class with this method in it and some sort of proxies for literals and global references. With a little doesNotUnderstand: magic I'd think you could run that method, hit those proxies and copy over other classes and methods as you hit them. I'm sure the devil is in the details, but this top level method that expresses exactly the things that you'd want to be able to do in this minimal image seems like it would be an interesting artifact in and of itself. It would be in a sense a specification of this image.
That is one way to do it. The alternative (which I used a couple of years back) is to say: Everything in Kernel-* should make for a self-contained kernel image. So I started by writing a script which would copy all the classes and while doing so rename all references to classes (regardless of whether defined in kernel or not).
At the end of the copying process you end up with a huge number of Undeclared variables. This is your starting point. Go in and add, remove or rewrite classes and methods so that they do not refer to entities outside of your environment. This requires applying some judgement calls, for example I had a category Kernel-Graphics which included Color, Point, and Rectangle. Then I did another pass removing lots of unused methods which I had determined to be unused.
At the end of the process I wrote a script that (via some arcane means) did a self-transformation of the image I was running and magically dropped from 30MB to 400k in size. Then I had a hard disk crash and most of the means that I've been using in this work were lost :-(((
I still have the resulting image but there is really no realistic way of recovering the process. Which is why I would argue that the better way to go is to write an image compiler that takes packages and compiles them into a new object memory. That way you are concentrating on the process rather than on the artifact (in my experience all the shrinking processes end up with nonrepeatable one-offs)
Cheers, - Andreas
On Sun, Mar 1, 2009 at 7:04 PM, Craig Latta craig@netjam.org wrote:
You should ask John, but the aim was to get a minimal Hello World so there isn't a compiler for example. But surely the answer is "whatever you need to get what you want done", right?
Heh, sure... but you're not answering my question. :) I'm wondering how others have gone about deriving the complete set of classes and methods a system needs to perform a particular task (so that I can compare to how I do it). Indeed, I should ask John.
Ah, well the way John does it in MicroSqueak is that one can run whatever expression one wants the new image to evaluate in the simulation hierarchy under MObject. So one can experiment. There are limitations, as I mentioned (one is using the host system's Booleans and Numbers, Array from brace constructs, and Message from doesNotUnderstand:), and one still has to figure out what needs to be in the specialObjectsArray. But as far as what "main" does, one can test that.
So if the goal is a microkernel from which one can bootstrap an image, one needs a compiler, a file system interface, and, for sanity, a minimal error-reporting framework, right?
Sure, but that doesn't help much with questions like "Should I include method X or not?". The necessity of several things in the system is rather subtle. :) I like to be able to point to any byte in an object memory and give a simple explanation as to why it's there, and to have performed a straightforward process to get that explanation.
Right. The simulation of "main" (plus your coverage tool?) answers that.
-C
-- Craig Latta www.netjam.org next show: 2009-03-13 (www.thishere.org)
2009/2/28 Eliot Miranda eliot.miranda@gmail.com:
On Sat, Feb 28, 2009 at 1:17 AM, edgar De Cleene edgardec2001@yahoo.com.ar wrote:
A standard "kernel image" that everyone builds off of has long been a pipe dream of nearly everyone in the community. I believe that such an image is not achievable in the short term; convincing all of the squeak distributions to adopt it would be nearly impossible to adopt incrementally.
Such an image exists: it is Pavel Krivanek's MorphicCore. We should go toward this, removing packages from the top and reshaping packages if packages as we know them today can't be unloaded/loaded nicely.
Any image containing a GUI is a non-starter IMO. People may not want a GUI (e.g. the embedded and scripting folks). People may want a particular GUI (MVC, Morphic, Tweak, Newspeak, Croquet, one of the native GUIs) with no vestiges of the old one. So the common image needs to be a small headless core that can bootstrap any image. This image needs minimal scripting support to respond to command-line bootstrap commands (including cross-platform stdin & stdout and a file interface), a compiler with which to compile code, collections, magnitudes, exceptions (as necessary), a default error handler that dumps the stack to stdout and then aborts, and that's about it.

All images derived from it should be derived by running scripts (a repeatable process). These scripts should be versioned. Further, this initial image should be built from scratch, e.g. using John Maloney's MicroSqueak as a starting point. In MicroSqueak a sub-hierarchy rooted at MObject (but it could be MProtoObject) defines the classes that will end up in the generated micro-kernel image. So this set of classes can be defined as a package and loaded into any Squeak. An image builder analyses the MObject hierarchy and from it generates a new image containing only the classes in that category, with all globals renamed from MFoo back to Foo. There are other approaches but John's is a good one.

One can test the result within Smalltalk using the IDE. (There are limitations; nil, true, false, Symbol, SmallInteger et al are not rooted in MObject but in Object.) One can browse the package using the IDE. The results of building from this can be recorded, e.g. if one bootstraps a minimal Morphic image from this "micro kernel Squeak image", the minimal Morphic image can itself be a starting point for other images because it is also a known, repeatably generatable object. So it too can reliably serve as the seed for other images.
Of course, any image can serve as the seed for any other but if it was built by hand and is ever lost it can never be recreated; at least one can never be sure one has recreated it exactly. Craig, do you agree? If so, how much of this do you have already? If not, what have I got wrong?
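The MFoo-to-Foo renaming step described above can be sketched as a simple string transformation (Python, illustrative only; the class list is hypothetical, and the real MicroSqueak builder works on class objects, not names):

```python
import re

# Strip MicroSqueak's M prefix when emitting the new image: MArray -> Array.
# Only names of the form M + capital letter are touched, so a name like
# "Morph" (lowercase after the M) is left alone.
def unprefix(name):
    m = re.fullmatch(r"M([A-Z]\w*)", name)
    return m.group(1) if m else name

hierarchy = ["MObject", "MArray", "MString", "MSmallInteger"]
print([unprefix(n) for n in hierarchy])
# -> ['Object', 'Array', 'String', 'SmallInteger']
```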
Let me put in my 5 cents :) Personally, I don't see anything wrong. I think things like a micro-kernel, mini-image, kernel-image (or call it as you like) should have been there from the very start of Squeak's existence, but I can only guess why it doesn't exist.

Since the first time I met Squeak, the inability to bootstrap my own image has always chased me. I didn't want a pre-built image, like 3.8. I wanted my own small image which contains as little as possible to be able to run my own code. I didn't care about the rest. Then I found that there is no such thing (or there is, but the solutions are not nearly as trivial as they should be), and this was very disappointing to me. :(
I know that the paradigm behind every Smalltalk is to view an object memory as a set of ever-evolving living objects. From that point of view, the need to reiterate the Process of Creation (bootstrapping) has little value. But following this road, we lost the most valuable things, like modularity and the ability to automatically strip down to a core. It went to the point that cutting 'optional' elements like Morphic out of the image became a nontrivial task which requires high expertise and consumes a lot of time. I think everyone agrees that a well-designed system should allow us to do that easily, without risk of breaking our necks while trying :)
BTW, I intend to build something like this when and if I do a new object representation for Cog later this year. (also see BTW2 below)
The first step was 3.10, which Ralph and I designed and built and which Damien uses for the dev images. The second step is SqueakLightII, which moves EToys and Nebraska (and others) out and lets them be reloaded. It also brings the idea of the Class repository and the "intelligent load". This is in beta now and can load old and new code, and foreign forks' code in some cases. It only needs help polishing these ideas and reaching common ground for all Squeak forks.
And we need "official images" , like Linux have a common kernel
A clarification: I was not the Morphic wizard; that was the amazing Morphic Teams I and II with Dan, Ned, Juan, and Jerome. I only learned a little from them, and I wish to learn a lot from you, Andreas, etc., as I am learning from all the wonderful people on the Board today.
Edgar
BTW2, IMO this (headless generation) also applies to the VM. VMMaker is fun but difficult to audit, error-prone, and source-code-control/repeatability unfriendly. VMMaker needs to be scriptable so that it can generate VM sources headlessly (easily done; the Newspeak team have already done it). Further, producing different versions of the source for different platforms is questionable. I would arrange that metadata on methods identified platform-specific code, e.g.

    myWindowsOnlyPrimitive
        <platforms: #(Win32)>
        self blah

generates

    #if WIN32
    sqInt myWindowsOnlyPrimitive() { blah(); }
    #endif /* WIN32 */

at least for the core VM, so that people can build a core VM for their platform from a single check-out containing one copy of the sources, not three.
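The pragma-to-preprocessor translation could be sketched as follows (Python, hypothetical; real Slang generation is far richer, and this variant emits defined() guards rather than the bare #if shown above):

```python
# Map a <platforms: #(...)> pragma to a C preprocessor guard so one copy
# of the sources builds everywhere. GUARDS and emit are hypothetical names.
GUARDS = {"Win32": "WIN32", "Unix": "UNIX", "MacOS": "MACOS"}

def emit(name, body, platforms=None):
    code = "sqInt %s(void) { %s }" % (name, body)
    if not platforms:
        return code  # common code needs no guard
    macros = " || ".join("defined(%s)" % GUARDS[p] for p in platforms)
    return "#if %s\n%s\n#endif" % (macros, code)

print(emit("myWindowsOnlyPrimitive", "blah();", ["Win32"]))
```

For the Win32-only primitive this prints the definition wrapped in an `#if defined(WIN32)` / `#endif` pair, while common methods come out unguarded.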
I would do it differently, and in much the same way as we do it in Smalltalk from day to day: make a class MyOwnPlugin, put in it the methods which are common to all platforms, and make #isAbstract return true on the class side; then you're free to add MyOwnWIN32Plugin, MyOwnUnixPlugin, etc.

It would be much more elegant than using C-isms which produce #ifs around the code. The generated code will be clean and easier to read/audit/whatever. :) Let's not be lazy; sometimes you may have only a single method which behaves differently on different platforms, but putting it into a subclass will serve clarity much better.
And the last thing: I dream of having everything in VMMaker / Smalltalk classes, of making every bit of code live in Slang and getting rid of manually crafted sources. Yes, we need to improve VMMaker and the Slang syntax to make things easier, especially easier than writing C code manually.
VMMaker is a metaprogramming tool, and we can modify it particularly easily compared to modifying manually crafted sources. (Hydra is a good example of this: I changed VMMaker to produce code which can run multiple interpreter instances, while keeping most of the code (> 99%) in the ObjectMemory and Interpreter classes untouched.)

Imagine how much harder it would be if we didn't have VMMaker and the VM sources consisted of manually crafted C sources.
I've recently made one good step in this direction in changing the header VMMaker generates. The existing one includes the date on which one pushed the button (what use is that?? It tells one *nothing* about what one has produced or where it came from; if one pushes the button starting from exactly the same starting point as yesterday, one generates different sources?!?!?!). The change is to state the packages from which it was built, e.g. here revealing one can't trust this because the package isn't checked in (as the * indicates):

    /* Automatically generated by CCodeGenerator
     * VMMaker-eem.293 uuid: dff7906f-2c49-4278-9401-8bccc2e6ef13
       from SimpleStackBasedCogit
     * VMMaker-eem.293 uuid: dff7906f-2c49-4278-9401-8bccc2e6ef13
     */
    static char __buildInfo[] = "SimpleStackBasedCogit * VMMaker-eem.293 uuid: dff7906f-2c49-4278-9401-8bccc2e6ef13 " __DATE__ ;
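The provenance idea generalizes: a generated-source header should be a pure function of the generating packages, never of the clock. A minimal sketch (Python; the helper is hypothetical, and the version and uuid are copied from the example above):

```python
# Emit a generated-source header recording package version and uuid
# (reproducible provenance) instead of a build date. A leading "*" marks
# an uncommitted, hence untrustworthy, package, Monticello-style.
def build_header(generator, pkg, uuid, dirty):
    version = ("* " + pkg) if dirty else pkg
    return ("/* Automatically generated by %s\n"
            "   %s uuid: %s */" % (generator, version, uuid))

print(build_header("CCodeGenerator", "VMMaker-eem.293",
                   "dff7906f-2c49-4278-9401-8bccc2e6ef13", dirty=True))
```

Two builds from the same checked-in package now produce byte-identical headers, so a diff of generated sources means a real change, not a new timestamp.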
good point.
Best Eliot
Hi Eliot--
Craig, do you agree?
Yes.
If so, how much of this do you have already?
Well, you mentioned:
This image needs minimal scripting support to respond to command-line bootstrap commands (including cross-platform stdin & stdout and a file interface)...
I do this through a web interface, which I prefer, but a module to do this through command-line parameters, stdin, and files would be straightforward.
...a compiler with which to compile code...
This is already a loadable thing, but I rarely load it because I don't need the compiler to install compiled methods.
...collections, magnitudes, exceptions (as necessary)...
Yes.
...a default error handler that dumps the stack to stdout and then aborts...
Yes.
...and that's about it.
Okay, there you go.
thanks,
-C
-- Craig Latta www.netjam.org next show: 2009-03-13 (www.thishere.org)
The KernelImage ZOO including the building environment is here: http://squeak.cz/public/pub/KernelImage/current/
Of course it is not perfect - the smallest image works only on Linux, and MorphicCore needs a lot of fixes, so people should start with MorphicExt and so on - but it is a usable starting point if someone is interested...
-- Pavel
Hi All,
edgar De Cleene wrote:
OK, so then we have a minimal image. How then do we see it being used as the kernel for Etoys, Croquet, Spoon, Pharo, etc.? I think Matthew's point is less about producing a minimal image and more about forks standardising on core packages incrementally until they eventually agree with each other on some notion of a kernel image.
If I understand correctly the images mentioned in this thread, SqueakLightII, MorphicCore etc would all be expected to adopt these standard packages.
My suggestion is that only an image which includes standard packages alone should be called a core image (and then only if one is needed? Would it be needed?). I'd reserve the term kernel image for the types of images Eliot and others are discussing.
- Zulq
Seems no one heard a word Matthew said. The idea that all those different forks will adopt some minimal kernel image is frankly absurd and highly unlikely to ever happen. Matthew's correct, the best we could hope for would be to get them using common core packages and even that isn't trivial and requires workers rather than board members.
Andreas is the only person I've seen proposing anything at all pragmatic for the board to do. They should stop trying to plan the future and start harvesting the present, get things that are done and proven into the main image. If it isn't done, and already in use by some subset of the community then it shouldn't exist in the eyes of the board.
This idea that the board should be leading things is frankly flawed; all the forked versions of and sub-communities of Squeak are the real leaders. They're the ones doing the work, experimenting with new ideas both good and bad, and making progress. The board should be following them and picking out the usable pieces for inclusion in mainline Squeak, or better yet, adopting Matthew's idea of core packages and forgetting the idea of a main Squeak image altogether.
Ramon Leon http://onsmalltalk.com
Hi Ramon,
Ramon Leon wrote:
Seems no one heard a word Matthew said. The idea that all those different forks will adopt some minimal kernel image is frankly absurd and highly unlikely to ever happen. Matthew's correct, the best we could hope for would be to get them using common core packages and even that isn't trivial and requires workers rather than board members.
I was trying to guide the discussion back to Matthew's proposal, which I agree with. I'm not suggesting they should adopt any image. I do think, however, that once a notion of a core/kernel image (a set of Standard Packages) is converged upon, an image containing these packages should be built and released - perhaps a conversation for another day/thread.

I think getting the various communities to buy in and commit to a vision of shared core packages should be the responsibility of the board. It should be in each community's interest to work on such an activity. This is where the real work will need to be done.
Andreas is the only person I've seen proposing anything at all pragmatic for the board to do. SNIP
What isn't pragmatic about Matthew's proposal? I don't think he suggests that the board does this themselves - maybe I misunderstood.

Matthew - could you explain further what you see as the board's roles and responsibilities in a Standard Packages project?
This idea that the board should be leading things is frankly flawed, all the forked version of and sub-communities of Squeak are the real leaders, they're the ones doing the work, experimenting with new ideas both good and bad, and making progress, the board should be following them and picking out the usable pieces for inclusion in the mainline Squeak, or better yet, Matthew's idea of core packages and forget the idea of a main Squeak image altogether.
I don't think the board should be following and picking. I don't think that works. The sub-communities should be playing an active role in this, and I think it should be the board's responsibility to:

- Convince each community that it's worth doing
- Find out and agree what needs to be done
- Enable each community to work
  - Remove barriers
  - Define processes
  - Resources: tools, email lists, servers, etc.
- Manage the overall progress and lend support where necessary
- Zulq
I was trying to guide the discussion back to Matthews proposal which I agree with. I'm not suggesting they should adopt any image.
You misunderstand, I was agreeing with you, the comment about what was absurd wasn't in reply to you but to the previous suggestion that all these different forks could be convinced to come back to a core image; just isn't going to happen.
I think getting the various communities to buy in and commit to a vision of shared core packages should be the responsibility of the board. It should be in each communities interest to work on such an activity. This is where the real work will need to be done.
Again, I was agreeing with that position.
Andreas is the only person I've seen proposing anything at all pragmatic
for the board to do. SNIP
What isn't pragmatic about Matthew's proposal? I don't think he suggests that the board does this themselves - maybe I misunderstood.
I like his proposal, that doesn't make it pragmatic, at least, not as pragmatic as what Andreas is always saying, harvest what's done, rather than making big plans for doing stuff that the past has shown rarely works out.
One of Squeak's big problems, IMHO, is that every version is planned and realised around someone's grand idea of features that should be in the next version. More often than not, this is vaporware: work someone wants to do, or thinks can be done, but isn't done, so everyone waits and waits and waits for a new version.
I'd much rather see releases done by specific dates, like a release every 6 months, each release harvesting the latest fixes and patches of accomplished work. Make it easier to get new stuff integrated into the core by releasing regularly and often rather than these pie in the sky visions of what might be. Guaranteed steady gradual evolution and continual progress beats the hell out of revolutionary grand changes that might or might not actually happen.
I don't think the board should be following and picking. I don't think that works. The sub communities should be playing an active role in this and I think it should be the boards responsibility to:
Convince each community that it's worth doing
Find out and agree what needs to be done
Enable each community to work
- Remove barriers
- Define processes
- Resources: tools, email lists, servers, etc
Manage the overall progress and lend support where necessary
Zulq
Which is what they've been doing, it's not been working so well. The Squeak community wouldn't be splintered into so many fragments otherwise. The Pharo guys forked because it's the only way to get anything done, it takes too long to try and get anything real done in the core Squeak, progress is glacial, so everyone forks.
Keith and Matthew are right, focus on packages and tools for sharing code, Andreas is right, focus on what's done, integrate it, make progress, stop the pie in the sky dreams of the ultimate system, or core kernel image, or whatever other pipe dream people keep chasing that clearly isn't being used by all the sub communities anyway.
I don't care what image I use, I care what packages I use. I care that my Collections package is the latest and greatest, or that my Seaside package is up to date, or that I'm using the newest FFI, or the right Polymorph. I don't care about eToys or Graphics or 3D or anything to do with educating kids. Packages are more important than images. The only thing I want from an image is that it isn't bloated with a bunch of unloadable code like eToys that infects everything and breaks everything when you try and remove it.
Eliot is right about building images from scripts as well; I want to automate the builds of the images I use with a script that loads just the packages I want onto a base image. Damien does this now with his dev images, a good starting point for me; then I use a script and Keith's Installer to further load and customize it to my liking.
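A build script of the kind described here can be sketched as follows (Python emitting a Smalltalk-flavoured load script; the package names, versions, and the `Installer load:` expression are all hypothetical, not the real Installer API):

```python
# Generate a repeatable, versioned build script: start from a base image,
# load an ordered list of pinned package versions, snapshot, and quit.
# Package names/versions and the load expression are illustrative only.
BUILD = [
    ("Collections", "lr.42"),
    ("Seaside",     "pmm.310"),
    ("OmniBrowser", "cwp.17"),
]

def build_script(base, packages):
    lines = ['"bootstrapping from %s"' % base]
    lines += ["Installer load: '%s-%s'." % (name, ver)
              for name, ver in packages]
    lines.append("Smalltalk snapshot: true andQuit: true.")
    return "\n".join(lines)

print(build_script("microkernel.image", BUILD))
```

Because the package versions are pinned and the script is itself versioned, anyone can regenerate the same image, which is exactly the property the hand-built images discussed in this thread lack.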
Anyway, now I'm just rambling, so I'll leave it at that. +1 for tools, Packages, and integrate what's done and get it out.
Ramon Leon http://onsmalltalk.com
On Sat, Feb 28, 2009 at 6:55 PM, Ramon Leon ramon.leon@allresnet.comwrote:
I need to point out that unless the various communities can start building their disparate and diverging images from a micro-kernel image, I don't see how improved execution technology is going to be adopted by the community. I'm working hard on a VM that will potentially be 10x faster than the current Squeak VM on Smalltalk-intensive benchmarks. This VM will be source-code compatible and bytecode compatible, but likely it will not be image compatible, as it will use a streamlined object representation that doesn't use compact classes. The only way I can see this being adopted by the community at large is if the community starts building images from microkernels.
I'm sure you can see there are significant benefits in building images from automatically constructed microkernels:

- images are no longer tied to obsolete image formats and/or object memory layouts and restrictions (such as a single contiguous heap segment with no support for memory-mapped segments, giving memory back to the OS after GC, pinning objects for the FFI, etc.)
- bytecode sets, compilation technology, and VM technology can evolve and improve performance as ingenuity permits
- platform integration can improve, e.g. allowing Squeak as a DLL
If the various forks maintain different basic images instead of different bootstraps (same end-point, different starting-points) then this just isn't practicable and performance won't improve.
So I hope you're wrong and that starting from kernel images isn't a pipe dream. Let me emphasize I am *not* advocating a common development image, only a common microkernel from which all other images can be built, automatically.
On Sat, Feb 28, 2009 at 7:32 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
I'm not sure that's true. Say it becomes yet another fork, separate (necessarily, at first, because it's a different image format) from all of the other forks. As long as most packages can be loaded into it, it'll get used. Maybe not by the people doing the forking (by Scratch, say, or Squeakland), but by the majority of us who have a few pet packages (in my case, Seaside, OmniBrowser, DabbleDB, etc) that we can load into nearly any Squeak image and feel at home. I'm pretty happy to load those into a MinimalMorphic image this month, a Pharo image next month, and a Cog image the month after, if there's some compelling reason to do so - and 10x performance would certainly be compelling.
A shared microkernel would be nice, but I don't think it's essential in the short term to drive adoption of a new technology.
Ditto, as I said earlier, I care about my packages, not which squeak image is the base, but for a 10x bump in speed, I'd certainly take the time to port everything I use, if only for deployment. Getting everyone on a common kernel would take something really compelling them all to feel the same way, a 10x bump in speed *is certainly* compelling.
Ramon Leon http://onsmalltalk.com
2009/3/1 Avi Bryant avi@dabbledb.com:
It is essential in terms of allowing the evolution of the VM as a regular process, not as a single major blow which, once done, is adopted and frozen for the next 10 years (or more). The VM has certain contracts with the language side, like object formats, the bytecode set, special objects and their slot positions, core classes, etc. If you don't fulfil the VM's obligations on this, you can't declare your image/work a safe, stable environment. And a kernel image is the best way to ensure that all such contracts are fulfilled. It serves to identify all such contracts, represent them at the language side, and make them work. Without a kernel image, we are doomed to wander in the dark, guessing what code needs to be changed to adopt new VM capabilities, and VM developers must care much more about backward compatibility (often sacrificing cool improvements just from fear that they will not be compatible).
On 01.03.2009, at 05:02, Avi Bryant wrote:
Agreed.
I'm still looking for ways to get Squeakland Etoys and squeak.org sharing resources as much as possible. We last cross-merged for the 3.8 release, which was a considerable manual effort, but worth it for both sides (etoys got many bug fixes, squeak.org got the m17n/unicode support). After that, some new packages were shared (Rome, DBus, GStreamer), but no reliable process for exchanging bug fixes exists - not because we do not want it, but because it's simply too much effort.
I'll support anyone who finds a way around this, a way to share between forks. Matthew's proposal sounds great and doable. I'll advocate starting to use Monticello for Etoys development - it's the only feasible way to share I see currently.
Sharing a microkernel image would be nice but I fear too much has changed in the class/metaclass kernel to make that feasible. But perhaps for using Cog, does it really require extensive image-level changes, or couldn't the SystemTracer be used to convert the image format?
- Bert -
Bert Freudenberg wrote:
Sharing a microkernel image would be nice but I fear too much has changed in the class/metaclass kernel to make that feasible. But perhaps for using Cog, does it really require extensive image-level changes, or couldn't the SystemTracer be used to convert the image format?
Actually, I would say that the SystemTracer is interesting exactly when you have extensive image changes that you want to do only once (say for generating the Squeak 5.0 image starting from the 4.3 one). Some smaller changes can be handled by doing the conversion while reading and saving the image. A good example of that in Cog would be the stack pages - you could create them when reading an old image and you can convert all contexts to old style ones (so the stack pages would be empty) before saving the image. A more radical change, like entirely different bytecode encoding, is easier to do just once with the SystemTracer.
This has been an interesting thread about what is possible technically as well as socially ("we could do that, but then the forks wouldn't adopt it"). My "Chunky Squeak" proposal (http://wiki.squeak.org/squeak/584) was meant to be an example of the smallest possible change that I felt would be acceptable for everybody. If people are feeling adventurous and would rather a really modular microkernel Squeak, I have worked on that as well as described in my "Neo Smalltalk modules" page (http://wiki.squeak.org/squeak/5637).
Any change to the image format is an extremely significant event in the history of Squeak, probably even worse than the 680x0->PowerPC, Mac OS 9->Mac OS X and PowerPC->Intel changes that the Macintosh community had to endure. I don't think it would be a good idea to do it in a series of steps (except in private development images, of course) but instead it would be better to gather all possible changes into a single 4.x->5.0 transition.
There are many little things I wouldn't mind seeing changed. I bet Bryce Kampjes would love to have a 0 tag represent SmallIntegers, for example. For SiliconSqueak (a hardware implementation of the VM that I am developing) it would be convenient to have tagged floating-point numbers, for another example. And I can't say that I am too happy with the very conservative path that 64-bit Squeak ended up taking.
If we agree that each sub-community will have its own images and will at most share Monticello-level packages, then none of the above is a big deal. Let each one do what they like best. But I prefer trying for a shared micro-image with as radical a solution as possible, though not so radical that we leave the community behind.
-- Jecel
Eliot Miranda wrote:
I need to point out that unless the various communities can start building their disparate and diverging images from a micro-kernel image, I don't see how improved execution technology is going to be adopted by the community. I'm working hard on a VM that will potentially be 10x faster than the current Squeak VM on Smalltalk-intensive benchmarks. This VM will be source-code compatible and bytecode compatible, but it likely will not be image compatible, as it will use a streamlined object representation that doesn't use compact classes. The only way I can see this being adopted by the community at large is if the community starts building images from microkernels.
maybe a silly question (I have no idea of what is involved): would converting an image from the current format to the one your VM will require be an option ?
Stef
2009/3/1 Stéphane Rollandin lecteur@zogotounga.net:
maybe a silly question (I have no idea of what is involved): would converting an image from the current format to the one your VM will require be an option ?
Converting an object format does not make objects behave differently, right? Ask yourself: why are you doing a lot of cleanup in Pharo, throwing away obsolete stuff and replacing it with better things, and why couldn't we do much the same with the VM? :)
Stef
maybe a silly question (I have no idea of what is involved): would converting an image from the current format to the one your VM will require be an option ?
Converting an object format does not make objects behave differently, right? Ask yourself: why are you doing a lot of cleanup in Pharo, throwing away obsolete stuff and replacing it with better things, and why couldn't we do much the same with the VM? :)
Stef
hum.. I guess I don't know what I am talking about. BTW, I am not the Stef working on Pharo, I'm another Stef :)
regards,
Stef
2009/3/1 Stéphane Rollandin lecteur@zogotounga.net:
ah.. sorry :) Changing the object format alone does not give any benefits. What is the point in having a new format when you keep using the old semantic model as before? This is like swapping the instance variable order in your class.. Apart from a better aesthetic view it gives you nothing :)
regards,
Stef
ah.. sorry :) Changing the object format alone does not give any benefits. What is the point in having a new format when you keep using the old semantic model as before? This is like swapping the instance variable order in your class.. Apart from a better aesthetic view it gives you nothing :)
Let me reformulate. What I really asked is: will there be some way to have existing images run on the coming Cog VM? (quoting Eliot Miranda: "this VM will be source code compatible and bytecode compatible but likely it will not be image compatible as it will use a streamlined object representation that doesn't use compact classes")
Stef
Igor Stasenko wrote:
Changing the object format alone does not give any benefits. What is the point in having a new format when you keep using the old semantic model as before?
Speed. That is the only point of the exercise to begin with.
This is like swapping the instance variable order in your class.. Apart from a better aesthetic view it gives you nothing :)
If swapping ivars in a class gave me a 3x performance improvement I'd be doing it all day long...
Cheers, - Andreas
2009/3/1 Andreas Raab andreas.raab@gmx.de:
Igor Stasenko wrote:
Changing the object format alone does not give any benefits. What is the point in having a new format when you keep using the old semantic model as before?
Speed. That is the only point of the exercise to begin with.
This is like swapping the instance variable order in your class.. Apart from a better aesthetic view it gives you nothing :)
If swapping ivars in a class would give me a 3x in performance I'd be doing this all day long...
but we both know that this is too good to be true :) Unless you change the way things work, you can't achieve a significant performance boost. And often this means rewriting interfaces, which inevitably leads to changing a lot of code on the language side, etc.
On Sun, Mar 1, 2009 at 12:23 PM, Igor Stasenko siguctua@gmail.com wrote:
but we both know that this is too good to be true :) Unless you change the way things work, you can't achieve a significant performance boost. And often this means rewriting interfaces, which inevitably leads to changing a lot of code on the language side, etc.
Uh, no. Here is the inline cache check in Cog, which is as complicated as it is because of compact classes:
00009588: movl %edx, %eax                                : 89 D0
0000958a: andl $0x00000001, %eax                         : 83 E0 01
0000958d: jnz .+0x00000011 (0x000095a0=singleRelease@40) : 75 11
0000958f: movl %ds:(%edx), %eax                          : 8B 42 00
00009592: shrl $0x0a, %eax                               : C1 E8 0A
00009595: andl $0x0000007c, %eax                         : 83 E0 7C
00009598: jnz .+0x00000006 (0x000095a0=singleRelease@40) : 75 06
0000959a: movl %ds:0xfffffffc(%edx), %eax                : 8B 42 FC
0000959d: andl $0xfffffffc, %eax                         : 83 E0 FC
000095a0: cmpl %ecx, %eax                                : 39 C8
000095a2: jnz .+0xffffffda (0x0000957e=LSICMissCall)     : 75 DA
In VisualWorks the code looks like
movl %ebx, %eax
andl $3, %eax
jnz LCompare
movl (%ebx), %eax
LCompare:
cmpl %eax, %edx
jnz +0xffffff??=LSICMissCall
That's 9 or 11 instructions (compact vs non-compact) vs 6 instructions in the common case, but vitally, for non-compact classes 2 memory reads vs one.
So indeed object representation can make a major difference in run-time performance. Consider how much quicker object allocation is in VW, which does not have to check if the receiving class is compact or not, compared to Squeak. Consider how much quicker string access is in VW, which has immediate characters, than Squeak with the character table and the inability to do == comparisons on Unicode characters. etc. etc.
Best Eliot
2009/3/1 Eliot Miranda eliot.miranda@gmail.com:
Sometimes I have trouble expressing my thoughts clearly.. sorry. I didn't mean that changing the object format does not improve speed. I meant that such changes alone are very hard to adopt without ANY changes on the language side. See Behavior>>becomeCompact becomeCompactSimplyAt: index becomeUncompact
see also #compactClassesArray
and I suspect this list is only the tip of the iceberg (for instance, you may need to change SpaceTally to report things properly).
Now, defending your point: really, it would be much easier to deal with such things in a micro-image (consider the amount of code and tests you need to run when producing a new update).
This makes you, as a VM developer, responsible for good integration of the VM with the language side. The rest of the images, which are based on it, will then have to use things strictly in the manner laid down in the kernel. It is important to draw a line between the kernel and the rest of the code in the image, which depends on it.
-- Best regards, Igor Stasenko AKA sig.
On Sun, Mar 1, 2009 at 1:01 PM, Igor Stasenko siguctua@gmail.com wrote:
Sometimes I have trouble expressing my thoughts clearly.. sorry.
No need to apologise. Everyone (me especially) can take a few mail messages to converge on the right meaning from time to time.
I didn't mean that changing the object format does not improve speed.
I meant that such changes alone are very hard to adopt without ANY changes on the language side. See Behavior>>becomeCompact becomeCompactSimplyAt: index becomeUncompact
see also #compactClassesArray
and I suspect this list is only the tip of the iceberg (for instance, you may need to change SpaceTally to report things properly).
You're absolutely right. The major image-level change I will require is for Behavior to implement identityHash with a primitive that is different from that in Object. Doing this allows me to implement a hidden class table in the VM where a class's identity hash is the index into the class table. An instance of a class has the class's class-table index (the class's id hash) stored in its header, not a direct pointer to the class. So every object has a more compact class reference, say 16, 20 or 24 bits. Also, class references in inline and method caches are class indices, not direct class references, which means less work on GC. But to ensure a class can be entered in the table by the VM at an unused index, Behavior>>identityHash must be a special primitive that the VM implements as searching the table for an unused index.
But the image-level code for compact classes can still exist; it's just that the VM will ignore it :)
Of course you're right about SpaceTally. Perhaps the VM should provide some primitives that allow SpaceTally to be parameterised. Of course, it's not until one tries to use such a parameterised SpaceTally on more than one object representation that one knows the design works across a range of object representations. And it's not as if one will be trying new object representations every week (although I don't know :) ). But it might be worth the effort. Probably simpler, though, is having the kernel of SpaceTally's computations be in the microkernel image, and writing some suitable tests to be run in a derivative image.
But my experience with 64-bit VW is that there are very few changes. We had the Behaviour>>identityHash primitive, the primitive that answered the size of the hash field, and that's about it. Note that the image already computes the size of SmallInteger by doing subtractions until overflow at start-up.
Now, defending your point: really, it would be much easier to deal with such things in a micro-image (consider the amount of code and tests you need to run when producing a new update).
This makes you, as a VM developer, responsible for good integration of the VM with the language side. The rest of the images, which are based on it, will then have to use things strictly in the manner laid down in the kernel. It is important to draw a line between the kernel and the rest of the code in the image, which depends on it.
Right. Agreed. And experience shows (16-bit => 32-bit, Squeak & VW 64-bit) that the new constraints introduced by the microkernel will be very few and unobtrusive.
Best Eliot
2009/3/1 Eliot Miranda eliot.miranda@gmail.com:
Right. Agreed. And experience shows (16-bit => 32-bit, Squeak & VW 64-bit) that the new constraints introduced by the microkernel will be very few and unobtrusive.
I don't agree with calling kernel modularisation a constraint :) A constraint is when you say: hey pals, you can't have a headless image - use one with Morphic instead, and if you really, really want it, we invented a workaround - a headless mode in the VM. ;)
On Sun, Mar 1, 2009 at 3:19 AM, Stéphane Rollandin lecteur@zogotounga.net wrote:
maybe a silly question (I have no idea of what is involved): would converting an image from the current format to the one your VM will require be an option ?
One can convert images from one format to another using the SystemTracer. This is a program that, from within an image, traces all the objects it can find and writes out a new image. But it's tricky. The system isn't usable while the tracer is running, and the result inevitably includes the SystemTracer itself, which one must later strip out if one wants e.g. a minimal deployment image. So the SystemTracer approach isn't great. (It also suffers from an issue explained below.)
One can also convert images from one format to another using a program that reads an image, transforms it, and writes it out (I'll call this ImageRewriter), but this is tricky too. The image may contain user code that has constraints the ImageRewriter isn't aware of. VisualWorks uses the ImageRewriter approach to convert 32-bit images to 64-bit images. It does succeed in rewriting the base development image, but occasionally it will fail to produce a working rewrite of some complicated image.
Even then, the image that ImageRewriter produces still needs to contain special support, and it needs to be saved to be ready for production. One thing the image does for itself on startup is check whether the size of the identity hash field has changed (64-bit VW images have a larger id hash than 32-bit VW images). If the image finds the id hash field has changed, it rehashes all hashed collections except MethodDictionary. The ImageRewriter (and SystemTracer) knows enough to be able to rehash MethodDictionary and IdentityDictionary. But because the default implementation of #hash in Object is to answer identityHash, a change in id hash can potentially affect equal-hashed collections, not just id-hashed ones. To rehash an equal-hashed collection one must be able to evaluate #hash and #=, and these are arbitrarily complex, so it gets tricky to make either ImageRewriter or SystemTracer do the rehash. Note that they would have to compute what the hash will be in the new image, not what it evaluates to in the current image. Hence it is much easier to have the new image rehash its non-MethodDictionary collections on start-up.
Clearly this is slow enough that one does it once when starting up the output of ImageRewriter, and then saves. The saved image then starts up without needing to rehash, because the id-hash size won't have changed.
So both the SystemTracer and ImageRewriter approaches have significant difficulties when trying to produce images in which things like the id hash have changed. They also have difficulties if the instruction set, class implementation, block implementation, etc. of the target has changed, because it may be difficult to set up the necessary invariants.
With the micro-kernel approach, the real image is produced by loading code into the microkernel. So the image transformers (be they SystemTracer, ImageRewriter or MicroKernelGenerator) only have to function on a known quantity, the microkernel image, not on an arbitrarily complex development or production image. Great. But if the microkernel image is simple enough, then why not generate it directly from a source specification (as John Maloney's MicroSqueak does)? Instead of stripping code from an existing image, as one must do with both SystemTracer and ImageRewriter, one produces just what the microkernel needs to contain, and it is exactly reproducible. Any image that tries to produce a microkernel will produce exactly the same microkernel, whereas with the SystemTracer and ImageRewriter approaches what one gets depends on the image one starts with.
So I much prefer the microkernel approach (I didn't use to; enlightenment comes slowly, if at all). It is not absolutely necessary, but it turns out to have significant advantages. The only downside (and I don't even think it is a downside) is needing to build up a development or production image by loading code into the microkernel (I actually think this is a feature :) ).
HTH
Best Eliot
ok, most of this stuff flies high above my head, but thanks for the explanation :)
Stef
On 28/02/09, Ramon Leon ramon.leon@allresnet.com wrote:
I agree with all that Ramon, Andreas, and Matthew have said till now.
My suggestion is that only an image which includes standard packages alone should be called a core image (and then only if one is needed - would it be needed?). I'd reserve the term kernel image for the types of images Eliot and others are discussing.
- Zulq
All images discussed here have some kind of trouble. And yes, I agree on standard packages, as several mentioned before. But again I ask that we go forward and think about a Class repository, like the one I use for reloading all "missed classes" when some .morph or .pr coming from a different fork is dragged and dropped or selected via the file list in SqueakLightII.
This Class repository could easily be produced for any fork. We could agree on the common ground between forks, and it would be easier to fix or improve the code of a single class rather than of a bigger package.
And a script for generating a valid image could load from this Class repository, as other Smalltalks do.
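[Editor's note] One way to read this idea, as a minimal sketch (all names hypothetical; this is not how SqueakLightII actually works): when a dropped .morph/.pr project needs classes the image doesn't define, fetch just those class definitions from a shared repository keyed by class name.

```python
# Sketch: a class repository keyed by class name. Purely illustrative;
# ClassRepository and load_project are invented names for this example.

class ClassRepository:
    def __init__(self):
        self._sources = {}              # class name -> source code

    def publish(self, name, source):
        self._sources[name] = source

    def fetch(self, name):
        return self._sources.get(name)

def load_project(needed_classes, defined, repo):
    """Resolve the classes a dropped project needs, pulling only
    the missing ones from the shared repository."""
    missing = [c for c in needed_classes if c not in defined]
    return {c: repo.fetch(c) for c in missing}

repo = ClassRepository()
repo.publish("Morph", "Object subclass: #Morph ...")

image_classes = {"Object", "Collection"}
print(load_project(["Object", "Morph"], image_classes, repo))
# → {'Morph': 'Object subclass: #Morph ...'}
```

Igor's question below (extension methods, overrides) is exactly where this per-class granularity gets hard: a class-sized brick has no obvious place to hang a method that one fork adds to another fork's class.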
Edgar
2009/3/1 edgar De Cleene edgardec2001@yahoo.com.ar:
All the images discussed here have some kind of trouble. And yes, I agree on standard packages, as several have mentioned before. But again I ask that we look forward and think about a Class repository, like the one I use for reloading all the "missing classes" when a .morph or .pr file coming from a different fork is dragged and dropped, or selected via the file list, in SqueakLightII.
This Class repository could easily be produced for any fork. We could agree on the common ground between forks, and it would be easier to fix or improve the code of a single class rather than of a bigger package.
And a script for generating a valid image could load from this Class repository, as other Smalltalks do.
So, in this repository the most basic brick is a class? How would it then deal with extension methods, or overrides? If you add code for handling them, you will eventually end up with something similar to MC :) So, what's the difference?
Edgar
Hello everyone,
+1 to everything Andreas, and Matthew said.
To reiterate: this is exactly where we have been heading with the Squeak 3.11 effort.
If you look at the goal we want to achieve, it is clear that there are two ways to approach it: 1) from the bits upwards, as Eliot describes, and 2) from what we have, and are using in production, downwards, in a carefully planned, engineered effort.
Both ways require tools that work: ways of packaging things, defining packages and their dependencies, loading those modules, and tools for generating the required result from the starting point.
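[Editor's note] The "defining packages, dependencies, and loading those modules" part boils down to computing a load order in which every dependency loads before its dependents, i.e. a topological sort. A minimal sketch, with a hypothetical dependency map (not a real Squeak package graph):

```python
# Sketch: derive a load order from declared package dependencies.
# The dependency map is invented for illustration.

def load_order(deps):
    """Return packages ordered so that every dependency loads first.
    Raises on circular dependencies, which can't be load-ordered."""
    order, seen = [], set()

    def visit(pkg, stack=()):
        if pkg in stack:
            raise ValueError(f"dependency cycle at {pkg}")
        if pkg in seen:
            return
        for dep in deps.get(pkg, []):
            visit(dep, stack + (pkg,))
        seen.add(pkg)
        order.append(pkg)

    for pkg in deps:
        visit(pkg)
    return order

deps = {
    "Compiler":    ["Collections", "Streams"],
    "Streams":     ["Collections"],
    "Collections": ["Kernel"],
    "Kernel":      [],
}
print(load_order(deps))
# Kernel comes before Collections, which comes before Streams and Compiler.
```

Monticello configurations and similar tools do essentially this over versioned packages; the sketch only shows the ordering step.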
We have several of these bottom up efforts on the go, Spoon, and Coke/Pepsi/fonc/Albert or whatever it is called today. For myself I am not a bits/bytes guy, not having to deal with that stuff is one reason I came to Smalltalk in the first place.
1) So... Spoon. Spoon is an example of starting from the bottom, with a fresh view, a fresh look at these tools. One hopes that Spoon will do all of these things superlatively, but I fear it will be a long time until I can use it for my commercial stuff.
2) 3.11/4.0+ = Tools that allow us to build stuff, test stuff, package stuff, and load stuff.
The expectation is that with these tools in place everyone can benefit, whether it be myself evolving 3.10 on to 3.11 slowly, Matthew in the relicencing, or Edgar and Klaus with the kernel images.
If we all have the same tools, and compatible packaging, external to our treasured creations, we start to have some common ground, and the ability to share ideas.
We do not expect to get everyone to use the exact same kernel, but we think we can get folks sharing stuff. If they share tools like Monticello, then it ensures that there is some minimal level of compatibility preserved between the forks. They can only share stuff like Monticello, if they can load it and keep up with the most maintained versions.
Once the common tools are in place, then Matthew's common core packages can actually happen. 3.11 may not change much in functionality, but it is planning to do things like splitting the "System" category up into sub-packages.
regards
Keith
On 2/28/09 9:28 PM, "Keith Hodges" keith_hodges@yahoo.co.uk wrote:
- From what we have,
and are using in production, downwards, in a carefully planned, engineered effort.
This was 3.10 and is now my SqueakLightII, but it seems that instead of having less code to deal with, you prefer to write more new code...
Edgar
On Sat, 28 Feb 2009 09:20:55 +0100, Matthew Fulmer wrote:
...
I would like the board to do the following project, and I can manage it:
By this time next year, every squeak distribution (squeak.org, Pharo, eToys, cobalt) will be running a standard version of the following three packages:
s/three/four/
- Magnitude
- Collections
- Streams
- Compiler

We will also fix and close all of the issues on Mantis relating to these packages
...
On Sat, Feb 28, 2009 at 9:20 PM, Matthew Fulmer tapplek@gmail.com wrote:
Squeak is a growing community with diverse needs. We have long outgrown the monolithic image left to us by our founders, Dan Ingalls and company.
Yes, but only the enlightened community members realise this. The rest of them still think Morphic was well designed.
2004: DPON, by Michael van der Gulik: A project to revive Henrik Godenryd's modularity framework abandoned in Squeak 3.3
Er... no it isn't! Namespaces/Packages is similar, but I'm redesigning stuff rather than reviving stuff.
We need to build things for those who would build better images themselves. Having many good images to choose from makes everybody happier. The only issue with the situation is that they are not always compatible. I believe this is the core issue that the board and the squeak release team needs to address.
I fully agree with this.
By this time next year, every squeak distribution (squeak.org, Pharo, eToys, cobalt) will be running a standard version of the following three packages:
- Collections
- Streams
- Compiler
- Kernel ?
Gulik.