RE: [ENH][Modules] Delta Modules [was: Another version]

4 Oct 2001

Andrew Black raises various concerns about the specification of delta 
modules that I largely share.
In my experience, one of the concepts that many Smalltalk programmers have 
difficulty accepting is that it is a good thing to distinguish between the 
external artifacts that define a functional unit of a program and the 
actual executable instantiation of that functional unit (with apologies to 
Korzybski, the class definition is not the class).  I think the delta module 
discussion/specification suffers from the problem of blurring the identity 
of the specification with the identity of the specified entity.
Consider a complex, deployed class named Foo (I could talk about "modules" 
but I think everybody on this list has a common understanding of the word 
"class".  I don't think this extends to the word "module").
Foo might be defined using four distinct units of code:
         BaseClass + Extension1 + Extension2 + Extension3
where:
         BaseClass = Implementation of ANSI Smalltalk specification of Foo
         Extension1 = Essential Squeak extensions to ANSI functionality
         Extension2 = MVC related methods
         Extension3 = Morphic related methods
It makes sense to manage these four pieces of code separately because in 
different situations we may want different compositions of the code 
units.  None of the above code fragments are actually the "class" Foo, they 
are simply specifications of portions of various potential classes each 
named Foo.  Practically speaking, there are probably five real classes 
that, depending on the usage situation, might be instantiated from these 
fragments:
         Foo = BaseClass
         Foo = BaseClass+Extension1
         Foo = BaseClass+Extension1+Extension2
         Foo = BaseClass+Extension1+Extension3
         Foo = BaseClass+Extension1+Extension2+Extension3
Regardless of which combination is used in a given situation, semantically 
we only have a single class Foo that is actually instantiated in any 
particular image (or single namespace). From the perspective of executing 
code, we don't care how that class was specified or constructed.  From a 
code management perspective, it should be clear that an extension is not a 
class.
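
The composition above can be sketched in a few lines. This is only an illustration, written in Python rather than Smalltalk, and it is not any proposed Squeak mechanism: the `compose` function, the fragment dictionaries, and the method names are all hypothetical. Each fragment is a specification (a table of methods); the class Foo exists only once a particular combination has been composed.

```python
# Hypothetical sketch: a fragment is a specification (method name -> function).
# None of the fragments is the class Foo.
def compose(name, *fragments):
    """Build one concrete class from an ordered sequence of fragments."""
    methods = {}
    for fragment in fragments:
        # Later fragments add to, or override, earlier ones.
        methods.update(fragment)
    return type(name, (object,), methods)


# Invented stand-ins for the four units of code.
base_class = {"size": lambda self: 0}
extension1 = {"is_empty": lambda self: self.size() == 0}
extension2 = {"open_in_mvc": lambda self: "MVC view"}
extension3 = {"open_in_morphic": lambda self: "Morphic view"}

# Two of the five possible classes. A given image would hold only one of them,
# and running code cannot tell how it was assembled.
FooMinimal = compose("Foo", base_class)
FooMorphic = compose("Foo", base_class, extension1, extension3)
```

Both results are classes named Foo, while `extension3` on its own remains a plain table of methods and nothing more.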
By analogy we should be able to reason similarly about modules and delta 
modules.
What confuses me is the distinction (if any) between "modules" as runtime 
semantic elements of the language (namespaces?) and "modules" as the 
external specifications of fragments (components?) that can be dynamically 
composed to make complete semantic entities.  If the word "module" means 
the semantic entity that corresponds to a populated namespace then it would 
make no sense to say that an extension is a module, for it clearly 
isn't.  If "module" means an external specification of a program fragment 
then an extension could arguably be called a kind of "module".
In my experience, it is very important to clearly make this distinction 
between the entity and its specification and to use distinct vocabulary 
when discussing them.  When this isn't done, we often get a confused 
blurring of two very different types of abstractions into a more complex 
multi-role abstraction. Simplicity seems to be a watchword of the Squeak 
community. Having two simple distinct abstractions is often simpler than 
having one larger multi-role abstraction.
Allen
At 10:55 AM 10/4/2001 +0200, Andrew P. Black wrote:
...
Dear Andreas and Henrik,
Forgive me for pushing on this, but the more you explain, the more 
confused I become.
I have read the discussion between Andreas and Stéphane, and certainly 
agree that when one is making an extension to a complex system, one has to 
do two things, which are rather different from each other:
    (1) Define some new stuff.  This is mostly classes, but might also 
include globals.  Examples might be the classes that make up the 
HTML manipulation and representation code, or the 3D engine.
    (2) Connect one's new stuff to the old stuff.  This is mostly adding 
methods to existing classes, but will also include changing at 
least one method somewhere, or else the new stuff will never be used 
(other than from a DoIt).  Examples of this are Andreas's 
Form>>displayInterpolatedOn:, which should change when the 3D engine is 
available, and String>>asUrl.
Now, as I read Andreas' reply, the intention is that Modules are for (1) 
and DeltaModules are for (2).   In other words, that this difference in 
usage, or policy, should be reflected by a difference in _mechanism_.  Is 
this right?
And Henrik's reply is that the difference in mechanism is that Modules 
define a bunch of stuff absolutely, while DeltaModules define a bunch of 
stuff by difference.  Is this right?
Now, I have spent a lot of time in the Operating Systems community, where 
one of the principles that we value highly is the separation of policy 
from mechanism.  We have found through hard experience that the best thing 
for the Operating System designer to do is to provide a small number of 
simple and powerful mechanisms, and let the people who use them come up 
with various policies for, e.g., the allocation of resources, that use 
these mechanisms in different ways.
The alternative approach, which seems at first to be attractive, is for 
the system designer to guess what policies the application writer will 
want, and "build them in".   This turns out not to work, and you can 
probably guess the reasons that it doesn't work.   First, the policies are 
a lot more complex than the mechanisms, so more complexity (and more bugs) 
gets pushed down into the lower levels of the system.   Second, the system 
designers are not very good at understanding what the application 
developers need to do, so the policies aren't quite right, but because 
they are "built in", and complex, they are very hard to change.  And, 
thirdly, the application developers will insist on trying to implement new 
kinds of application, including things that the system developers had 
never thought of, and for which those built-in policies are inappropriate.
I think that Smalltalk has also, for 20 years, adopted a policy-mechanism 
separation, and this has contributed greatly to its success, power and 
longevity.  One example of this is inheritance, which is a mechanism that 
can be used in many different ways.  There is no language syntax that 
distinguishes between an abstract and a concrete superclass: these are 
indeed two different policies (uses of the class definition mechanism), 
but there is just one mechanism. Another example is the self-describing 
"chunk" mechanism used for file in/out, which can be used to build and 
read both class fileOuts and changeSets.  The policy/mechanism separation 
is a good rule of thumb, and usually gives one a simple powerful system 
that can be used in a multitude of different ways, including many that 
have not yet been thought of.
Of course, policy/mechanism separation is only a rule of thumb, and there 
may be a reason to break this rule.  But I believe that we should 
understand that reason fully, and should think hard, before we choose to do so.
It seems to me that the distinction between Module and DeltaModule, at 
least as Andreas describes it, is breaking the policy/mechanism 
separation.   I am not (yet) saying that this is wrong.  But I _am_ asking 
pointedly "Why?"   What is the compelling reason that makes you want to 
break the rule?  And I am not getting an answer to that question.
Now, I had assumed up until this morning that a Module could contain not 
only whole classes, but also "class extensions", that is, groups of 
methods that could be added to existing classes, even though those classes 
might be defined in different modules.  (The String>>asUrl example again, 
which I had assumed could be part of the HTML module). Now I see that I 
might be wrong about this, and that Andreas' point of view would say that 
Modules shall contain _only_ whole classes, and DeltaModules shall contain 
_only_ class extensions and perhaps class retractions (deleting 
methods).  In spite of the many pages now on the Wiki, I can find only one 
paragraph that describes what is _in_ a Module:
From http://minnow.cc.gatech.edu/squeak/2069 :
...
A module
  - may contain zero or more submodules.
  - may also take zero or more other modules as "input parameters".
  - will of course also have contents proper: classes, globals, etc.

I notice that "proper contents" does not include methods.  Is this 
intentional?  Is this a mechanism restriction that is intended to make it 
hard for me to put the "wrong stuff" in my module, that is, to put in 
"loose methods" that change some other classes?  In other words, are you 
trying to enforce good behaviour on the part of the programmer by 
restricting the available mechanisms?  If so, I admire your good 
intentions, but I think that you are misguided.  Bad programmers are just 
too ingenious!
Since Modules can contain sub-modules, and some of those sub-Modules can 
be DeltaModules, then it seems that in any case you have not achieved 
anything with this restriction.  I just put my "loose methods" (like 
String >> asUrl) into a DeltaModule inside my Module, and the unsuspecting 
user who imports my module is still surprised that it "damages" other 
classes like String, or Form, or whatever.
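
A minimal sketch of this loophole, in Python and under assumed semantics (the `Module` and `DeltaModule` classes below are hypothetical models, not the proposed Squeak design, and `Text` stands in for String): installing a module that exports only whole classes still changes an outside class, because the change rides along in a nested delta.

```python
class Url:
    """A new class that the HTML module defines outright."""
    def __init__(self, text):
        self.text = text


class Text:
    """Stands in for String: an existing class owned by some other module."""
    def __init__(self, value):
        self.value = value


class DeltaModule:
    """Hypothetical model: a set of methods to add to an existing class."""
    def __init__(self, target, added_methods):
        self.target = target
        self.added_methods = added_methods

    def install(self):
        for name, function in self.added_methods.items():
            setattr(self.target, name, function)


class Module:
    """Hypothetical model: whole classes only, plus nested submodules."""
    def __init__(self, name, classes=(), submodules=()):
        self.name = name
        self.classes = list(classes)
        self.submodules = list(submodules)

    def install(self):
        # Installing the module installs its submodules, deltas included.
        for submodule in self.submodules:
            submodule.install()
        return {cls.__name__: cls for cls in self.classes}


as_url_delta = DeltaModule(Text, {"as_url": lambda self: Url(self.value)})
html_module = Module("HTML", classes=[Url], submodules=[as_url_delta])

had_as_url_before = hasattr(Text, "as_url")
exported = html_module.install()
had_as_url_after = hasattr(Text, "as_url")
```

In this model the module exports only `Url`, yet `Text` has gained `as_url` once the install finishes, which is exactly the surprise described above.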
I think that we should indeed do what Henrik writes on the Wiki: build a 
"minimal and yet powerful semantic model for a Squeak Modules system".  I 
don't think that this policy separation between defining new and modifying 
old need be part of that model.  I think that this distinction, and others 
that we have not yet thought of, will be displayed by the modules that we 
write, just as the Abstract/Concrete class distinction is displayed by the 
classes that we write.  But I have yet to hear an argument for putting it 
in the model.
    Andrew