Re: Modules

23 Feb 2005


      Hi, Dan,
I would love to be part of your team. And I am interested in working
on all the (not for long) future technologies.
About modules and packages and other related code organization structures:
I think I should start by mentioning the extent and limitations of
what I consider my relevant (to this subject) experience. I have had
extensive, as well as low-level, experience with Envy, and I have
built additional organizational structures and workflow tools on top
of it. I have ported code from Envy to Store, and I have worked with
Store. I have also worked with several Java IDEs and SCMs.
I have also worked on creating a minimal VW image, an experiment
similar to Craig's Spoon.
And now, my view of the world, in (not so) short:
What I like about VW parcels/packages: they encapsulate in a robust
(from a loading perspective) form an independent piece of
functionality. Very Smalltalkish in the sense that they allow partial
loading: if a parcel contains methods of a class that is missing, no
problem, the parcel holds onto the uninstalled methods, and when/if
the missing class is loaded, it installs the methods. This could be
extended even to missing superclasses, which would make them
practically load-order independent. They also have the notion of
(stackable) overrides, so if in your package you change a method from
a different package, you can easily browse both the override and the
overridden code, but most importantly you can safely unload your
package and things are restored properly.
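To make the two behaviours above concrete (held-back methods and stackable overrides), here is a minimal Python sketch. It is not how VW parcels are actually implemented; the ToyImage class and all of its method names are made up for illustration, and method "source" is just a string.

```python
class ToyImage:
    """Hypothetical toy image: class name -> {selector: override stack}."""

    def __init__(self):
        self.classes = {}   # installed classes and their per-selector stacks
        self.pending = {}   # class name -> methods waiting for that class

    def define_class(self, name):
        self.classes.setdefault(name, {})
        # Install whatever was held back while the class was missing.
        for package, selector, source in self.pending.pop(name, []):
            self.install(package, name, selector, source)

    def install(self, package, class_name, selector, source):
        if class_name not in self.classes:
            # Class is missing: hold onto the method instead of failing.
            held = self.pending.setdefault(class_name, [])
            held.append((package, selector, source))
            return
        # Push onto the override stack; the top entry is the active method.
        stack = self.classes[class_name].setdefault(selector, [])
        stack.append((package, source))

    def active(self, class_name, selector):
        stack = self.classes[class_name].get(selector)
        return stack[-1][1] if stack else None

    def unload(self, package):
        # Remove the package's entries wherever they sit in a stack, so
        # whatever it had overridden becomes active again.
        for methods in self.classes.values():
            for selector in list(methods):
                kept = [e for e in methods[selector] if e[0] != package]
                if kept:
                    methods[selector] = kept
                else:
                    del methods[selector]
        for name in list(self.pending):
            kept = [e for e in self.pending[name] if e[0] != package]
            if kept:
                self.pending[name] = kept
            else:
                del self.pending[name]
```

Installing a method for a class that is not there yet just parks it in pending, and unloading a package that overrode a base method puts the base method back on top.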
One thing that you don't have with load-order independence is
dependency information. Your parcel/package may load but you have no
clue if it will run. Of course, one could manually check Undeclared,
look for unimplemented but sent selectors, etc., but I think we could
offer more tool support for dependency management. Envy does not allow
out-of-order loads, and the applications' (Envy's packages)
prerequisites information is enforced only for superclass-subclasses
and class-extensions relationships (it does not allow you to subclass
or extend in an application where the original class is not visible
(through an explicit dependency declaration)). IMHO dependency
information is useful, but it should not stop your code-writing
workflow, nor should it stop you from loading partially. A potential
solution (for also having dependency information) would be to compute
(as extensively as reasonably possible) and store the dependencies at
freezing/versioning time. Since this is a best-effort solution, I
don't think it should require the packages that you depend on to be
frozen/versioned as well. Dependency information could be stored
as "version x" if the dependency is a version, or "version x+" if it's
been modified since the last time it was versioned (as x). The base
image is obviously versioned as well.
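A rough Python sketch of the "version x" / "version x+" idea follows. Everything in it is an assumption made for illustration: the Package record, the freeze function, and the idea that a package's references have already been computed by some other tool.

```python
from dataclasses import dataclass


@dataclass
class Package:
    name: str
    last_version: str                    # most recent frozen version
    modified: bool = False               # edited since last_version?
    references: frozenset = frozenset()  # names of packages it uses


def dependency_label(pkg):
    # "x" for a clean version, "x+" if modified since it was versioned.
    return f"{pkg.name} {pkg.last_version}" + ("+" if pkg.modified else "")


def freeze(pkg, new_version, loaded):
    """Version pkg and record best-effort dependency information.

    loaded maps package names to the Package objects in the image.
    Dependencies are recorded as they are; they are not forced to be
    versioned first, and unresolved references do not block the freeze.
    """
    requires, unresolved = [], []
    for name in sorted(pkg.references):
        if name in loaded:
            requires.append(dependency_label(loaded[name]))
        else:
            unresolved.append(name)
    pkg.last_version = new_version
    pkg.modified = False
    return {"version": new_version,
            "requires": requires,
            "unresolved": unresolved}
```

The point of the sketch is that freezing never refuses to proceed: a modified dependency is simply recorded with a trailing "+", and a missing one is listed rather than treated as an error.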
It is perhaps obvious by now that I consider versioning an important
feature of any organizational structure. A version is a shareable,
immutable snapshot, at a finer granularity than the whole image. For
VW the granularity is the package, for Envy it is the class. If one
ignores the explicit and named aspects of versioning, Envy's
versioning granularity is actually at the method level. Each Envy
method "edition" is an immutable, timestamped snapshot of the method.
They are created automatically at each "accept", which is what makes
them impractical for remote servers. This is the main argument in
Store for their much coarser granularity (less chat), but I think this
is the wrong approach: latency can be addressed for example with
background processing, and frequency can be decreased by making method
versioning a separate, explicit user action. I think that in general,
the finer the granularity, the better. If one recognizes that not every
accept deserves a method version (obviously one could not have tested
a method at accept time) and each change is logged locally anyway, we
can grant methods their own, named versions, which may be explicitly
pushed to the repository, and which can later come in very handy when
browsing the history. All methods have to be versioned when the class
(extension) is versioned, and this can be done automatically, just
like classes have to be versioned when the package is versioned. In
addition, the versions of methods that belong to a class version can
be marked as special when browsing the method version history, just
like class versions that belong to a package version can be marked
specially. Sorry for the perhaps too low-level details, I just wanted
to write things down. And Dan did ask us what we wanted to see in such
a system :)
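Here is a small Python sketch of that scheme, under my own assumptions: every accept is logged locally as an edition, a method version is a separate explicit action, and versioning a class automatically versions any method that changed and marks each method version as belonging to that class version. MethodHistory and ClassHistory are hypothetical names, not Envy or Store classes.

```python
class MethodHistory:
    def __init__(self):
        self.editions = []   # every accept, logged locally only
        self.versions = []   # explicit, named snapshots for the repository

    def accept(self, source):
        self.editions.append(source)

    def is_dirty(self):
        # True if the current edition has no version of its own yet.
        return (not self.versions
                or self.versions[-1]["source"] != self.editions[-1])

    def version(self, name):
        entry = {"name": name,
                 "source": self.editions[-1],
                 "class_versions": []}   # marks shown in the history browser
        self.versions.append(entry)
        return entry


class ClassHistory:
    def __init__(self):
        self.methods = {}    # selector -> MethodHistory

    def accept(self, selector, source):
        self.methods.setdefault(selector, MethodHistory()).accept(source)

    def version(self, class_version):
        """Version the class; unversioned methods are versioned automatically."""
        snapshot = {}
        for selector, history in self.methods.items():
            if history.is_dirty():
                history.version(f"{class_version}/{selector}")
            current = history.versions[-1]
            current["class_versions"].append(class_version)
            snapshot[selector] = current["name"]
        return snapshot
```

A method that was accepted twice and versioned once ends up with two local editions but only one repository version, and that version carries the mark of the class version it went into.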
Now, method versions are not interesting just for themselves. A
different kind of code organization is a patch, or a unit of work
which happens after the packaging structure has been defined, and is
perhaps cross-cutting through many different packages. This is a
changeset in a non-packaged, non-versioned, and limited-collaboration
universe. In a versioned, packaged world, the changesets themselves
should be versioned entities, and be composed of versioned
sub-entities. It is especially for changesets that method-level
versioning comes in handy, because here the finer, method-level
granularity is needed. If you are forced to create new (entire) class
versions for inclusion in a versioned patch, this not only adds noise
but also creates a much higher incidence of merge conflicts.
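The conflict argument can be illustrated with a tiny Python sketch. It assumes a patch is just a mapping from (class, selector) to a method version name; both functions are hypothetical, and the class-level one deliberately models the coarse case where each patch ships a whole class version.

```python
def method_conflicts(patch_a, patch_b):
    # Conflict only where both patches change the same method differently.
    shared = patch_a.keys() & patch_b.keys()
    return {unit for unit in shared if patch_a[unit] != patch_b[unit]}


def class_conflicts(patch_a, patch_b):
    # With whole-class versions, touching the same class at all
    # means the two class versions have to be merged.
    classes_a = {cls for cls, _ in patch_a}
    classes_b = {cls for cls, _ in patch_b}
    return classes_a & classes_b


fix_x = {("Point", "x"): "x.v2"}
fix_y = {("Point", "y"): "y.v3"}
```

With fix_x and fix_y as above, method_conflicts finds nothing to merge, while class_conflicts reports Point even though the two patches never touch the same method.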
As far as namespaces go, the problem to be solved seems much easier,
and I think everybody agrees that a heavy-weight solution like in VW
is inappropriate. I profoundly disliked in VW the fact that they
namespaced the base image in a lot of small, meaningless namespaces
(although there was no name conflict to be solved), just as a display
of what could be done with them. I disliked the fact that namespaces
were made into objects as first-class as classes, to my mind without
the same conceptual justification. I also profoundly disliked that
they now had both namespaces (for the rest of us) and namescopes (for
the compiler) as two very similar, yet distinct, class hierarchies
with very similar responsibilities.
I think that the name lookup rules for the compiler should be the same
as the ones for our code, and I think that the base image should
contain no namespace other than Smalltalk. There is indeed potential
for name conflicts when independently developed packages are put
together in the same image. But if we are only trying to solve this
issue, and we don't mix it with categorizations (which should be done
by packages), I would think that a simple rule like "all external
applications' classes and packages should live in their own (only one)
external namespace" should be sufficient. This could easily happen
automatically, with a prompt for an image-wide development namespace
at the first class creation (like the initials prompt for the
changeset). Each corporation would have their own namespace and they
would do all of their development in it. And a few privileged among
us like Dan (and maintainers of what is accepted as part of the base)
would always just type "Smalltalk", and that would be it :)
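A minimal Python sketch of this rule, assuming one development namespace chosen per image and a single lookup routine shared by the compiler and by ordinary code. The NamespacedImage class and the example names (AcmeCorp, OtherCorp) are invented for illustration.

```python
class NamespacedImage:
    def __init__(self, dev_namespace="Smalltalk"):
        # Chosen once per image, e.g. prompted at the first class creation.
        self.dev_namespace = dev_namespace
        self.namespaces = {"Smalltalk": {}}
        self.namespaces.setdefault(dev_namespace, {})

    def define(self, name, value, namespace=None):
        target = namespace or self.dev_namespace
        self.namespaces.setdefault(target, {})[name] = value

    def lookup(self, name, from_namespace=None):
        # One rule for everybody: own namespace first, then Smalltalk.
        start = from_namespace or self.dev_namespace
        for candidate in (start, "Smalltalk"):
            bindings = self.namespaces.get(candidate, {})
            if name in bindings:
                return bindings[name]
        raise KeyError(name)


img = NamespacedImage("AcmeCorp")
img.define("Object", "base Object", namespace="Smalltalk")
img.define("Invoice", "Acme Invoice")
img.define("Invoice", "Other Invoice", namespace="OtherCorp")
```

Two corporations can each have an Invoice class without clashing, and both still see the base image's classes through the Smalltalk fallback.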
About image construction versus image stripping, I think we should be
able to do both. The image is a very powerful instrument for
development and I want to keep using it. By using it I will
necessarily dirty it, but I don't care as long as it is still a work
in progress. When I think I am done, I want to be able to extract the
application from the image, and I want tools to help me with that.
Once extracted, applications can be used to build clean images.
Please note that, while packages help with keeping things organized,
they are not a solution for code rot, and a long-lived application
will be touched by many hands, not all of them informed or competent
or careful. There will always be a case for stripping. There is a case
and a market for application extraction in Java, which does not have
images at all, so stripping is not so much about the image itself.
Building from scratch is useful, but it only eliminates unpackaged
things, it does not deal with packaged dead code.
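As a rough illustration of what an extraction tool might do, here is a Python sketch of a reachability walk over selectors. It is only a sketch under strong assumptions: the send graph is given as a plain mapping, and sends are matched by selector alone, which is about the best one can do without type information. The function names are hypothetical.

```python
def reachable(sends, roots):
    """sends maps a selector to the selectors it sends."""
    seen = set()
    todo = list(roots)
    while todo:
        selector = todo.pop()
        if selector in seen:
            continue
        seen.add(selector)
        todo.extend(sends.get(selector, ()))
    return seen


def packaged_dead_code(packages, sends, roots):
    """Selectors that sit in a package but are never reached from roots."""
    live = reachable(sends, roots)
    dead = {}
    for package, selectors in packages.items():
        unused = sorted(set(selectors) - live)
        if unused:
            dead[package] = unused
    return dead
```

Anything this reports is properly packaged, so a build from scratch would happily load it again; only a stripping pass of this kind would notice it.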
Cheers,
Florin