As a long-time Debian and Squeak user--though I haven't done much squeaking lately--I'd like to see the two get together. I have some late-night thoughts.
One possible analogy is with systems such as virus and spam scanners that get regular updates from "upstream" (that is, upstream from Debian), sometimes directly from upstream. The image, or possibly change sets, might be seen as analogous.
By the way, Debian does actually package lots of pieces from repositories, including those for Perl, Python, TeX, and R.
Another analogy is to packages that have significant amounts of non-verbal data: star charts, representations of the solar system, music, maps. And many packages have at least some graphical material. The main concern with these that I was aware of was that some were just too big.
Thomas's message (appended below) seems to raise several distinct concerns.
The first is auditability in the licensing sense. Debian needs to be sure that the license of everything is known, and that the licenses are compatible with the DFSG (Debian Free Software Guidelines), as well as being compatible with each other. At one level it seems to me this could be addressed by a license that says "everything in the image is under this license." At another level, software systems frequently turn out, on inspection, to have parts with licenses that are either unclear or unfree. By unclear I mean it's not clear what license, if any, exists, and who claims ownership or copyright. Sometimes the system includes components with licenses that don't get along (BSD and GNU?). So I'm not sure the blanket assertion is sufficient, and Debian might want to take a closer look at the parts.
At the level of detailed checks, things are harder. One advantage of text is that you can start at the top and go to the bottom, and know you've examined it all. In an image, there's always the possibility that you'll miss something (e.g., a game shows a copyrighted image only after an obscure series of steps), or that parts (e.g., the Welcoming text, other help, or some components) really might have someone who could pop up and claim them.
In principle one could enumerate all the objects in an image; maybe that would provide some comfort. One can even imagine a tool that identifies all text and audio/video-like objects and the context in which they occur.
I think it has always been pretty clear that stuff going into Squeak was under a very permissive license, so that might allay some of the concerns.
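To make the enumeration idea a bit more concrete, here is a rough sketch in Python rather than Smalltalk. It is only a toy analogy: it walks an ordinary Python object graph, not a real Squeak image, and the names enumerate_objects and content_like are made up for illustration. It assumes that "text-like" means str and "media-like" means raw bytes, and it reports the path at which each such object was first reached as its context.

```python
from collections import deque


def enumerate_objects(root, root_name="root"):
    """Breadth-first walk over everything reachable from root.

    Yields (path, obj) pairs; the path is the context an object was found in.
    Each object is reported once, at the first path that reaches it.
    """
    seen = set()
    queue = deque([(root_name, root)])
    while queue:
        path, obj = queue.popleft()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        yield path, obj
        if isinstance(obj, dict):
            for key, value in obj.items():
                queue.append((f"{path}[{key!r}]", value))
        elif isinstance(obj, (list, tuple)):
            for index, value in enumerate(obj):
                queue.append((f"{path}[{index}]", value))
        elif hasattr(obj, "__dict__"):
            # Plain instances: follow their attributes.
            for name, value in vars(obj).items():
                queue.append((f"{path}.{name}", value))


def content_like(root):
    """Return (path, kind) for every text-like or binary-like object."""
    found = []
    for path, obj in enumerate_objects(root):
        if isinstance(obj, str):
            found.append((path, "text"))
        elif isinstance(obj, (bytes, bytearray)):
            found.append((path, "binary"))
    return found
```

Such a listing would at least give a reviewer a finite checklist of content to look at, though it says nothing about who owns any of it.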
A second concern behind the desire for "source" is that people be able to inspect, modify and adapt what they have. I think the focus on text may be a bit misplaced here; text is simply an easy way to achieve that. I don't know that Debian policy, or the more fundamental documents like Debian's "social contract", require it. I believe it is explicitly acknowledged that editability is the key concern. For example, a PostScript file is text, but distributing only PostScript is not considered OK: one needs to distribute the material that generated the PostScript. Someone else brought up PDFs. I think in Debian the idea is that if you have a binary package with PDFs, you can get the source package, and the source package will have the raw materials that made the PDF, which you can modify to create your own PDF.
Generally, the Squeak image is also the most natural form for modifying itself. In fact, the whole source vs binary package distinction clearly arose for compiled languages (though not all Debian packages consist of programs that get run through a compiler), and is an awkward fit for Smalltalk.
I think there may be some difficulties getting non-Smalltalkers to wrap their heads around this self-referential quality of an image. But the whole system was clearly designed to be inspectable and malleable, and so seems to me fundamentally consistent with the DFSG's concern that people be able to understand and modify the software they use. (The self-referential quality is not unique: C compilers are written in C, and most higher-level language systems consist significantly of code written in the higher-level language.)
A final concern is with the differences between versions. First of all, I'm not sure where that's coming from: it doesn't seem as central to the core Debian principles. Of course, it does have practical significance. The last time I checked, most of the evolution of images consisted of taking in change sets, or change-set-like things, so that seems pretty auditable. I'm not exactly sure what the dominant tools for managing updates are now, so perhaps life is not so simple.
There are other considerations that I don't think were raised in the message from Thomas, but seem worth mentioning.
First, there seem to be a lot of kinds of images floating around, and being proposed: minimal, beginner, developer, web developer, MVC, Tweak, Morphic, ... Should they all be packaged? Just some or one? None, with the message "go and pick one"? (The last doesn't seem too friendly, but is one way around the concern with images.)
Second, images alone are not a good way for existing users to get updates, for the simple reason that using a new one will mean tossing out everything you've done. (Yes, I know there are ways to move stuff between images, but keeping current with an update stream seems easier.) This is a bit different from the typical Debian package. If I get a new compiler, I generally don't lose my source code. Debian has gone to great pains to make it easy to keep your customizations when you update other packages (e.g., apache, spamassassin). The only analogous strategy I can see would be to have the Squeak launcher script automatically export your changes to a new image, and that's got all kinds of problems.
Third, Debian divides software into a "main" section and "non-free" (or maybe other?). My understanding is that Squeak is in main. Software can't go in main if it depends on stuff outside of main. This might be problematic if some kind of "get the image yourself" method is followed. However, this issue only affects what section of the archive the package is in.
Finally, I am not a Debian Developer or an expert on Debian policy. I am not even exactly awake! I know I've gone on a bit, but I hope it's some help.
It probably would be useful to take this to debian-devel soon.
Ross Boylan.
Yeah, I know: top-posting :)
On Wed, 2008-05-21 at 23:34 +0200, Bert Freudenberg wrote:
Etoys was being considered to get into Debian. Now it may be rejected, because an image file is not "transparent enough" (see below). It was suggested to discuss this issue on the debian-devel list.
Do any of you have ideas how to respond? Are there perhaps other Debian packages that have a similar issue of accountability?
And how hard would it actually be to bootstrap a fresh Squeak image from sources nowadays?
- Bert -
Begin forwarded message:
From: Thomas Viehmann tv@beamnet.de
Date: 21 May 2008 23:06:38 CEST
To: "José L. Redrejo Rodríguez" jredrejo@edu.juntaextremadura.net
Cc: Bert Freudenberg bert@freudenbergs.de, ftpmaster@debian.org, holger@layer-acht.org
Subject: etoys_3.0.1916+svn132-1_amd64.changes (almost) REJECTED
Reply-To: ftpmaster@debian.org
(OK, for technical reasons, this is not the REJECT, but there is little point in delaying this mail now that I have written it.)
Hi José, Bert, Holger,
this is, unfortunately, the REJECT of etoys. First of all, thanks Bert, Holger, José for the discussion of some of the concepts. However, I am afraid that there are some fundamental concerns that have not been eliminated (yet). As such I would like to invite you to start a discussion on the packaging of squeak session images on debian-devel@lists.debian.org. Feel free to forward this mail if you consider it useful as a starting point.
It seems to me that the method of distributing VM sessions in .image files as the preferred form of modification does not match too well with Debian practices of compiling packages from source and having easy access to the differences between various versions of a package.
So as far as I understand it, it seems that a typical squeak image cannot be bootstrapped[1] from (textual) source and that the typical mode of operation is to modify some known image and distribute the result. As such, the preferred form of modification is indeed the image file.
This, in my opinion, raises at least the following questions that seem fundamental to me:
- How easy should it be to figure out what is in an image?
While the source code to any class seems to be available, the image is more than that. Does that matter? Should source of Debian packages be auditable and is etoys currently auditable easily enough?
- Does Debian (including the various teams that might have to take a look at your packages) want to have easy access to the differences between upstream version, one Debian revision and another? Do squeak session images provide this in a way that is acceptable to Debian?
From the squeak wiki pages and your explanations it seems that what I would consider at least partial solutions exist, but it seems that either I am still lacking understanding of important concepts or that the etoys packaging (Debian and maybe also upstream) could possibly be made a bit more transparent. Of course, we might find out that my difficulties with the perspective of squeak images in Debian originate in misconceptions of Debian packaging and maintenance that I may have. Either way, I would appreciate it if we could discuss this with the Debian development public at large and draw on their additional expertise.
Kind regards
Thomas
-- Thomas Viehmann, http://thomas.viehmann.net/