If a class is still referenced in code, then removed, and later filed back in, the former references do not point to the newly filed-in class, but to the obsolete former class.
This breaks (at least) the unloading and re-loading of Monticello packages. You have to manually recompile all methods that reference the class.
Would moving a class to Undeclared be a good fix for that? Or do we need a weak registry for removed classes to be able to efficiently fix up the references later?
Btw, there are two methods for detecting this situation:
SystemNavigation default browseObsoleteMethodReferences
SystemNavigation default browseObsoleteReferences
One is much faster than the other, yet both give identical results for me. Do we need both?
- Bert -
On 12.02.2006 at 14:55, Bert Freudenberg wrote:
If a class is still referenced in code, then removed, and later filed back in, the former references do not point to the newly filed-in class, but to the obsolete former class.
On a related note: The PointerFinder (aka 'chase pointers' menu item) does not find these references. I just submitted a fix:
http://bugs.impara.de/view.php?id=2715
Hopefully now the PointerFinder will never ever return with empty hands again ;-)
- Bert -
On Sunday, 12 February 2006 at 14:55, Bert Freudenberg wrote:
If a class is still referenced in code, then removed, and later filed back in, the former references do not point to the newly filed-in class, but to the obsolete former class.
This breaks (at least) the unloading and re-loading of Monticello packages. You have to manually recompile all methods that reference the class.
Would moving a class to Undeclared be a good fix for that? Or do we need a weak registry for removed classes to be able to efficiently fix up the references later?
- Bert -
That's what is done in VW. Removed classes are moved to Undeclared (in fact, this applies to any entry removed from any namespace, not only Smalltalk). In VW, the association value is nilled out when moved to Undeclared.
If you want the association preserved, the inverse move must be done when loading a class: Undeclared keys must be checked and the association transferred to the SmalltalkEnvironment.
I think this would solve most of your Monticello problem.
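A minimal sketch of this scheme, written in Python so that it can be run outside an image. It is a toy model, not VW or Squeak code: Binding stands in for the Association, and Environment, load_class and remove_class are hypothetical names. It shows a reference held by a method seeing nil while the class is gone, and the new class once it has been loaded again.

```python
class Binding:
    """Stands in for a Smalltalk Association: a name plus a mutable value."""
    def __init__(self, name, value):
        self.name = name
        self.value = value


class Environment:
    """Toy system dictionary with an Undeclared side table (hypothetical model)."""
    def __init__(self):
        self.globals = {}      # name -> Binding for live classes
        self.undeclared = {}   # name -> Binding for removed or unknown names

    def load_class(self, name, cls):
        # If a binding for this name is parked in Undeclared, move it back
        # so that every method still holding it sees the new class.
        binding = self.undeclared.pop(name, None)
        if binding is None:
            binding = self.globals.get(name) or Binding(name, None)
        binding.value = cls
        self.globals[name] = binding
        return binding

    def remove_class(self, name):
        # Park the binding in Undeclared and nil out its value, so the
        # obsolete class is no longer held through the binding.
        binding = self.globals.pop(name)
        binding.value = None
        self.undeclared[name] = binding


env = Environment()
old_class = object()
ref = env.load_class("C", old_class)   # a compiled method would hold `ref`
env.remove_class("C")
removed_value = ref.value              # None while the class is gone
new_class = object()
env.load_class("C", new_class)
rebound = ref.value is new_class       # the old reference now sees the new class
```

Because methods keep the binding rather than the class, nothing needs to be recompiled in this model when the class comes back.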
But this could also have tricky results in a future (?) multi-namespace environment. Suppose NameSpace A has a class C. NameSpace B also has a class C.
If you remove both classes A.C and B.C, then load class C again, what happens? Should every namespace have its own Undeclared? Or should Undeclared hold the same key several times (no longer a Dictionary, but a Set of Associations...)? In VW, I do not know how they handle this case...
Without namespaces, there might be other tricks, because Undeclared can also contain references to a removed inst var or class var...
The idea of making the whole Undeclared dictionary weak sounds good to me. Unreferenced entries would then be garbage collected automatically. What do you think of that?
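The weak Undeclared idea can be tried out in the same toy style. This sketch assumes Python's weakref.WeakValueDictionary as a stand-in for a weak dictionary; Binding and method_literals are hypothetical names, and weak collections inside a Smalltalk image may behave differently.

```python
import gc
import weakref


class Binding:
    """Stands in for an Association; methods hold strong references to it."""
    def __init__(self, name):
        self.name = name
        self.value = None


# Hypothetical weak Undeclared: an entry lives only while something else
# (here, a "method") still references the binding.
undeclared = weakref.WeakValueDictionary()

binding = Binding("C")
undeclared["C"] = binding
method_literals = [binding]      # a method referencing the undeclared name
del binding

still_there = "C" in undeclared  # the method keeps the binding alive

method_literals.clear()          # the last referencing method goes away
gc.collect()                     # make sure the binding has been reclaimed
gone = "C" not in undeclared
```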
On 12.02.2006 at 19:00, nicolas cellier wrote:
the association value is nilled out when moved to Undeclared.
Ah, that's a good solution. No need to hold onto the obsolete classes, or is there?
Maybe if we want to migrate instances later?
If you want the association preserved, the inverse move must be done when loading a class: Undeclared keys must be checked and the association transferred to the SmalltalkEnvironment.
This already happens in the file-in code.
But this could also have tricky results in a future (?) multi-namespace environment. Suppose NameSpace A has a class C. NameSpace B also has a class C.
If you remove both classes A.C and B.C, then load class C again, what happens?
Nothing - the latest proposal for Squeak namespaces was to just use #A::C and #B::C as a class name. Nice and clean.
The idea of making the whole Undeclared dictionary weak sounds good to me. Unreferenced entries would then be garbage collected automatically. What do you think of that?
Sounds good to me ... unless there is code which relies on temporarily moving stuff to Undeclared. I wouldn't rule that one out ;-)
- Bert -
Bert Freudenberg bert@impara.de wrote: [SNIP]
But this could also have tricky results in a future (?) multi-namespace environment. Suppose NameSpace A has a class C. NameSpace B also has a class C.
If you remove both classes A.C and B.C, then load class C again, what happens?
Nothing - the latest proposal for Squeak namespaces was to just use #A::C and #B::C as a class name. Nice and clean.
Perhaps I should dust off that code once again. I still feel it had lots of nice properties (simple, backwards compatible, no need for new fileout formats, tools still work etc).
regards, Göran
On 13.02.2006 at 09:06, goran@krampe.se wrote:
Perhaps I should dust off that code once again. I still feel it had lots of nice properties (simple, backwards compatible, no need for new fileout formats, tools still work etc).
Please do :)
- Bert -
I've been playing around with a new VM (heh, heh) which, for a while, happened not to intern (ie force unique instances of) SmallIntegers. In this case the use of == to mean arithmetic equality will not work properly. In my opinion, all such occurrences in the system should be eliminated ASAP; == is not an arithmetic compare in any Smalltalk I know of. While it may work with small constants, it is simply wrong, and an especially bad example for newbies to see. Besides failing in certain interpreters, it will fail in Squeak itself if the integers are not small.
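To make the failure mode concrete, here is a toy model in Python. It is not the VM Dan describes: BoxedInt, InterningVM and NonInterningVM are hypothetical names, same_object plays the role of == and equals the role of =.

```python
class BoxedInt:
    """A heap-allocated integer object, as in a VM that boxes every integer."""
    def __init__(self, n):
        self.n = n

    def equals(self, other):          # models Smalltalk's =
        return self.n == other.n

    def same_object(self, other):     # models Smalltalk's ==
        return self is other


class InterningVM:
    """Hands out one unique object per integer value (hypothetical model)."""
    def __init__(self):
        self._table = {}

    def make_int(self, n):
        return self._table.setdefault(n, BoxedInt(n))


class NonInterningVM:
    """Allocates a fresh object for every integer result."""
    def make_int(self, n):
        return BoxedInt(n)


def size_is_zero_by_identity(vm, collection):
    # Models `someCollection size == 0`
    return vm.make_int(len(collection)).same_object(vm.make_int(0))


def size_is_zero_by_equality(vm, collection):
    # Models `someCollection size = 0`
    return vm.make_int(len(collection)).equals(vm.make_int(0))
```

In this model the identity test on an empty collection answers true only on the interning VM, while the equality test answers true on both.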
I regret that I don't have time to fix these right now. However, if there is a well-intentioned soul out there, he or she will perhaps find the method below to be quite useful. It found 165 methods in my system with this pattern.
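For readers without an image at hand, the pattern test can be approximated outside Smalltalk. This Python sketch assumes, following the comment in the method below, that bytes 116 to 119 push -1, 0, 1 and 2, and that byte 198 is the send of ==. It treats every byte as a one-byte instruction, which the real InstructionStream does not, so it is only an approximation. The function names are hypothetical.

```python
PUSH_SMALL_CONSTANTS = range(116, 120)   # assumed: push -1, 0, 1, 2
SEND_IDENTITY = 198                      # assumed: send of ==


def scan_for_eq_small_constant(bytecodes):
    """Return True if a push of -1, 0, 1 or 2 is directly followed by byte 198.

    Simplification: every byte is treated as a one-byte instruction, whereas
    InstructionStream steps over multi-byte instructions properly.
    """
    return any(
        first in PUSH_SMALL_CONSTANTS and second == SEND_IDENTITY
        for first, second in zip(bytecodes, bytecodes[1:])
    )


def select_methods(methods):
    """Mimic browseAllSelect: over a dict of selector -> bytecode list."""
    return sorted(name for name, code in methods.items()
                  if scan_for_eq_small_constant(code))
```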
Hope this helps.
- Dan -----------------------------------------------
<CompiledMethod>scanForEqSmallConstant "Answer whether the receiver contains the pattern <expression> == <constant>, where constant is -1, 0, 1, or 2..."
    | scanner |
    scanner _ InstructionStream on: self.
    ^ scanner scanFor: [:instr |
        (instr between: 116 and: 119) and: [scanner followingByte = 198]]

" SystemNavigation new browseAllSelect: [:m | m scanForEqSmallConstant] "
On 13.02.2006, at 18:37, Dan Ingalls wrote:
I've been playing around with a new VM (heh, heh) which, for a while, happened not to intern (ie force unique instances of) SmallIntegers. In this case the use of == to mean arithmetic equality will not work properly. In my opinion, all such occurrences in the system should be eliminated ASAP; == is not an arithmetic compare in any Smalltalk I know of. While it may work with small constants, it is simply wrong, and an especially bad example for newbies to see. Besides failing in certain interpreters, it will fail in Squeak itself if the integers are not small.
I regret that I don't have time to fix these right now. However, if there is a well-intentioned soul out there, he or she will perhaps find the method below to be quite useful. It found 165 methods in my system with this pattern.
The interesting thing is that a quite large percentage of those come from the beloved
someCollection size == 0 ifTrue: []
pattern that lots of people like so much... "calling isEmpty is too slow" they will tell you, (and ifEmpty: is *really* evil). As the main objective is speed, they of course don't use #=.
For the newbies: Do not optimize for speed before you have proven that it makes sense, and then *document* the hack. Using a hack by default because it "may be too slow" is not a good idea...
Marcus
On 13.02.2006, at 18:37, Dan Ingalls wrote:
I've been playing around with a new VM (heh, heh) which, for a while, happened not to intern (ie force unique instances of) SmallIntegers. In this case the use of == to mean arithmetic equality will not work properly. In my opinion, all such occurrences in the system should be eliminated ASAP; == is not an arithmetic compare in any Smalltalk I know of. While it may work with small constants, it is simply wrong, and an especially bad example for newbies to see. Besides failing in certain interpreters, it will fail in Squeak itself if the integers are not small.
I regret that I don't have time to fix these right now. However, if there is a well-intentioned soul out there, he or she will perhaps find the method below to be quite useful. It found 165 methods in my system with this pattern.
The interesting thing is that a quite large percentage of those come from the beloved
someCollection size == 0 ifTrue: []
pattern that lots of people like so much... "calling isEmpty is too slow" they will tell you, (and ifEmpty: is *really* evil). As the main objective is speed, they of course don't use #=.
I take issue with the "of course" here. I defy anyone to demonstrate a significant (even detectable) speedup of == over = between SmallIntegers on any meaningful benchmark.
For the newbies: Do not optimize for speed before you have proven that it makes sense, and then *document* the hack. Using a hack by default because it "may be too slow" is not a good idea... Marcus
And for the "pros": Do not optimize for speed before you have proven that it makes sense, and then *document* the hack. Using a hack by default because it "may be too slow" is not a good idea... ;-) Dan
On 13-Feb-06, at 1:40 PM, Dan Ingalls wrote:
For the newbies: Do not optimize for speed before you have proven that it makes sense, and then *document* the hack. Using a hack by default because it "may be too slow" is not a good idea... Marcus
And for the "pros": Do not optimize for speed before you have proven that it makes sense, and then *document* the hack. Using a hack by default because it "may be too slow" is not a good idea...
Or more succinctly, On Optimisation:
a) Don't
b) (For experts only) Don't, Yet
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: OI: Vey
On Monday, 13 February 2006 at 22:40, Dan Ingalls wrote:
I take issue with the "of course" here. I defy anyone to demonstrate a significant (even detectable) speedup of == over = between SmallIntegers on any meaningful benchmark.
Just to confirm what Dan says, I did this in VW 7.3 and Squeak 3.8:
| t1 t2 t3 |
t1 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) isEmpty]].
t2 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) size = 0]].
t3 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) size == 0]].
^Array with: t1 with: t2 with: t3
VW: #(627 311 291) #(693 302 292)
Squeak: #(6619 3959 4146) #(6558 3976 4126)
If such an optimization matters, it should be done at the VM level... that needs a guru for JIT, method cache, or something we do not have yet... but don't bother too much at the upper level.
nicolas cellier ncellier@ifrance.com wrote: ...
| t1 t2 t3 |
t1 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) isEmpty]].
t2 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) size = 0]].
t3 := Time millisecondsToRun: [10000000 timesRepeat: [#(1 2 3 4) size == 0]].
^Array with: t1 with: t2 with: t3
VW: #(627 311 291) #(693 302 292)
Squeak: #(6619 3959 4146) #(6558 3976 4126)
If such an optimization matters, it should be done at the VM level... that needs a guru for JIT, method cache, or something we do not have yet... but don't bother too much at the upper level.
In Squeak 3.8, #== seems to be even slower than #=. I see the same here: #(4941 3081 3101). But in Squeak 1.18 I get #(8977 5313 4923). These were measured with the 3.7.7 and 1.18 Unix VMs, respectively.
Greetings and thank you, tim, for your convincing answer.
Wolfgang Helbig
Hi dan
this is really nice to see coming back to real life :). Can you tell us a bit more about this new VM? How does it relate to pepsi?
Stef
PS: I still have the dream that I could learn VMs by reading the code of a nice one and then give a lecture on that, so your remark about newbies learning is always what pushes me to clean the system (remembering your Byte '81 quote about a system so simple that a single person could understand it)
PSPS note that I use simple and not small because cool abstractions really build simple knowledge :)
stéphane ducasse ducasse@iam.unibe.ch wrote...
this is really nice to see coming back to real life :). Can you tell us a bit more about this new VM? How does it relate to pepsi?
This one doesn't relate to pepsi (for folks not aware of it, "pepsi" refers to one of Ian Piumarta's designs for a small dynamic kernel), but I hope that it has prepared me to do some related experiments. As you know, I am at Sun now where "curly brace" languages are still the rule, but I want to keep experimenting with dynamic systems, thin clients, and the like.
I got inspired by a project that Helge Horch did, reviving an old St-78 image on top of Java (not the one he showed at ESUG). The thing actually runs very nicely (much faster than the original Notetaker), and fits in a .jar file that is under 200k including both image and interpreter. I think Helge will release this in the not-too-distant future, but he likes things to be really right, and he's currently over-busy, so it may be a few months yet. It is a gem, both for this compactness and performance, and also because, language-wise, it is a living Smalltalk-76.
Helge preserved all the 16-bit oops, object table, and reference-counting aspects of St-78 VM, but I was inspired by the performance and simplicity of riding on top of Java. It blew my mind that with this modest attachment, you could have Smalltalk live on a web page.
So, in order to learn Java, and the Java development environments, I am writing a Squeak in Java. The image I am using is the Mini2.1 -- the one with browser, debugger and decompiler with temp names, all in 600k. The interpreter runs now (15,000 bytecodes executed), and I am currently working on BitBlt (the story of my life ;-). The interpreter is not just a transcription of the Squeak reference interpreter. Instead (like Helge's), it uses Java objects for the objects, and thus can let Java take care of all the storage management. I've figured out a way to preserve enumeration and mutation, so it ought to be a pretty compatible implementation when it's done.
I don't know where this will go. I think it would be fun to take that base and strip it down to just the kernel and then hook it up to some modern web-based graphics, network and database APIs. For now my goal is to get the original artifact going. Then, hopefully we could all have fun taking it in other directions.
- Dan
Dan Ingalls wrote:
I don't know where this will go. I think it would be fun to take
It sounds like very, very interesting work! And I think everybody now is just dying to see Helge's stuff! :-)
that base and strip it down to just the kernel and then hook it up to some modern web-based graphics, network and database APIs. For now my goal is to get the original artifact going. Then, hopefully we could all have fun taking it in other directions.
Hmm, so we can have sqax*? ;-)
Michael
* instead of ajax
Hi -
I've got my little Squeak in Java running (hope to send out a link soon), and I've been pondering how to make it run faster. In the process, I've thought of two techniques, one of which is new (to me) and the other occurred to me years ago, but I never tried it out.
Since neither would really be all that hard to do in Squeak, I thought I'd mention them here for those folks who delight in such things, and with the further hope that someone might actually try them out.
Lazy Activation This was the next thing I was going to do for Apple Smalltalk back when I got drafted to the hotel business back in 1987. The essence of the idea is that the purpose of activating a context is to save the execution state in case you have to do a send and, conversely, you don't really need an activation if you never need to do a real send.
I had a lot of fun instrumenting the VM to figure out just how many activations could be avoided in this way, and my recollection is that it was roughly 50%. I believe the statistics were better dynamically than statically, because there are a lot of methods that, in general need to be activated, but they may begin with a test such as position > limit ifTrue: [^ false] and for every time that this test succeeds, you can get away without ever needing an activation.
But, you say, you still need a pointer to the method bytes and a stack frame, and this is true, but you don't need to allocate and initialize a full context, nor to transfer the arguments. The idea is that, when you hit the send, you do the lookup, find the method, and then jump to a *separate copy* of the interpreter that has a different set of bytecode service routines. For instance, 'loadTemp' will, depending on the argument count, load from the stack of the calling method (which is still the "active" context). 'Push', since there is no allocated stack, pushes into a static array and, eg, 'plus' does the same old add, but it gets its args from the static array, and puts its result back there. And if anything fancy, such as a real send, does occur, then a special routine is called to do a real activation, copy this static state into it appropriately, and retry the bytecode in the normal interpreter.
It's probably worth confirming the results that I remember, but I wouldn't be surprised if one could almost double the speed of Squeak in this manner.
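The control flow described above can be sketched with a toy interpreter. This is only a model in Python under stated assumptions, not Squeak VM code: ToyVM, Context and the instruction names (load_arg, push, greater, return_if_true, return_top, send) are all hypothetical. A call starts in lazy mode with a scratch list and no context; only when it meets a send is a real context created, the scratch state copied into it, and that same instruction retried by the normal loop.

```python
class Context:
    """A real activation record: what lazy activation tries to avoid."""
    def __init__(self, code, args, pc=0, stack=()):
        self.code, self.args = code, list(args)   # args are copied only here
        self.pc, self.stack = pc, list(stack)


class ToyVM:
    def __init__(self, methods):
        self.methods = methods          # selector -> list of instructions
        self.activations = 0            # real contexts created
        self.calls = 0                  # method invocations overall

    def call(self, selector, args):
        self.calls += 1
        code = self.methods[selector]
        scratch, pc = [], 0             # lazy mode: no context, scratch list
        while True:
            op, *rest = code[pc]
            if op == "send":            # a real send: activate and retry there
                return self.run(self.activate(code, args, pc, scratch))
            done, result = self.step(op, rest, args, scratch)
            if done:
                return result
            pc += 1

    def activate(self, code, args, pc, scratch):
        self.activations += 1
        return Context(code, args, pc, scratch)

    def run(self, ctx):
        """The normal interpreter loop, working on a real context."""
        while True:
            op, *rest = ctx.code[ctx.pc]
            if op == "send":
                selector, nargs = rest
                call_args = [ctx.stack.pop() for _ in range(nargs)][::-1]
                ctx.stack.append(self.call(selector, call_args))
            else:
                done, result = self.step(op, rest, ctx.args, ctx.stack)
                if done:
                    return result
            ctx.pc += 1

    @staticmethod
    def step(op, rest, args, stack):
        """Execute one send-free instruction; return (finished, result)."""
        if op == "load_arg":
            stack.append(args[rest[0]])
        elif op == "push":
            stack.append(rest[0])
        elif op == "greater":
            b, a = stack.pop(), stack.pop()
            stack.append(a > b)
        elif op == "return_if_true":
            if stack.pop():
                return True, rest[0]
        elif op == "return_top":
            return True, stack.pop()
        return False, None


methods = {
    # Guard in the style of: position > limit ifTrue: [^ false], then a send
    "check": [("load_arg", 0), ("load_arg", 1), ("greater",),
              ("return_if_true", False),
              ("load_arg", 0), ("send", "identity", 1), ("return_top",)],
    "identity": [("load_arg", 0), ("return_top",)],
}
vm = ToyVM(methods)
early = vm.call("check", [9, 5])    # guard succeeds: no activation needed
late = vm.call("check", [3, 5])     # falls through to a real send
```

Comparing vm.calls with vm.activations afterwards gives the kind of dynamic statistic mentioned above, for this toy workload only.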
Cloned Activation This one I just thought of, but I can't believe someone hasn't already tried it, either in squeak or some similar system. The idea here is to provide a field in the method cache for an extra copy of a properly initialized context for the method (ie, correct frame size, method installed, pc and stack pointer set, etc). Then, when a send occurs, all you have to do is an array copy into blank storage, followed by a short copy of receiver and args from the calling stack.
There's a space penalty for carrying all the extra context templates, of course, but I think it's not unreasonable. Also, one could avoid it for all one-time code by only allocating the extra clone on the second call (ie, first call gets it into the method cache; second call allocates clone for the cache).
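A rough Python model of the template idea, including the allocate-on-second-call refinement. MethodCache, CacheEntry and Context are hypothetical names, and the cache key and frame layout are assumptions made for illustration, not how Squeak's method cache is actually laid out.

```python
import copy


class Context:
    """A toy method context with a fixed-size frame."""
    def __init__(self, method, frame_size):
        self.method = method
        self.pc = 0
        self.sp = 0
        self.frame = [None] * frame_size
        self.receiver = None


class CacheEntry:
    def __init__(self, method, frame_size):
        self.method = method
        self.frame_size = frame_size
        self.hits = 0
        self.template = None        # filled in on the second call only


class MethodCache:
    """Hypothetical method cache keyed by (class name, selector)."""
    def __init__(self):
        self.entries = {}
        self.built_from_scratch = 0
        self.cloned = 0

    def activate(self, key, method, frame_size, receiver, args):
        entry = self.entries.setdefault(key, CacheEntry(method, frame_size))
        entry.hits += 1
        if entry.template is not None:
            ctx = copy.copy(entry.template)          # the "array copy"
            ctx.frame = list(entry.template.frame)   # fresh frame storage
            self.cloned += 1
        else:
            ctx = Context(entry.method, entry.frame_size)
            self.built_from_scratch += 1
            if entry.hits == 2:                      # not one-time code: keep a clone
                entry.template = Context(entry.method, entry.frame_size)
        ctx.receiver = receiver                      # short copy of receiver and args
        ctx.frame[:len(args)] = args
        ctx.sp = len(args)
        return ctx
```

In this model the first two activations of a method are built the slow way and every later one is cloned from the template; the sketch assumes frame_size is at least the number of arguments.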
I have little sense of how much this might help these days -- I haven't looked in detail at the activation code for quite a while. Obviously the worse it is right now, the more this technique might help.
Mainly I just like to think about this stuff, and it occurred to me that, if someone were looking for a fun experiment or two, it might turn out to have some practical value. I haven't looked at Exupery to know whether these things are already being done, or whether they might fit well with the other techniques there, but I'm sure Bryce could say right off the bat.
- Dan
Dan,
Lazy Activation
I included a slightly related idea in a 4 bit Smalltalk:
http://www.merlintec.com:8080/Hardware/dietST
This had an "enter" bytecode for explicitly creating a new context and a "grabArg" bytecode for moving stuff from the sender's context to the newly created one. The idea was that the compiler would generate the bytecodes for this as late in a method as possible (in the best cases - never). This was inspired by the Smalltalks that defer the creation of temporary variables until their first assignment. This static solution is not as powerful as your dynamic one, but it does have a few things in common.
This project only got as far as a SmaCC compiler for these bytecodes in Squeak, so I never got any dynamic statistics for this.
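The static placement could look roughly like the following compiler pass. This is a guess at the shape of the idea in Python, not the dietST compiler: place_enter is a hypothetical name, the instruction tuples are invented for illustration, grabArg is left out, and only straight-line code is handled.

```python
def place_enter(instructions, needs_context=("send",)):
    """Insert a single 'enter' just before the first instruction that needs a
    context; leave the method untouched if there is none.

    Assumes straight-line code. With branches, each path to a send would need
    its own analysis, which this sketch does not attempt.
    """
    for index, (op, *_operands) in enumerate(instructions):
        if op in needs_context:
            return instructions[:index] + [("enter",)] + instructions[index:]
    return list(instructions)
```

A leaf method comes back unchanged, which corresponds to the best case of never creating a context.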
Cloned Activation
This is what I did in NeoLogo:
http://www.merlintec.com/pegasus2000/e_neologo.html
NeoLogo was just a paper design but this feature was also present in the "SuperLogo" which I implemented in 1983 in TI99/4A Extended BASIC. It worked great and actually makes the run time simpler at the cost of slightly complicating the parser. Self actually explains method activation in this way to the users (though the implementation is radically different) because it is easier to understand than the traditional schemes.
-- Jecel
Hi Dan --
Wrt your first optimization ... The SCHEME guys used similar arguments in one or more of the SCHEME papers to show that there are many cases in which no stack has to be allocated or popped (so a simple goto in the code will do the job). Someone on the Squeak list probably has a reference to the paper or papers I'm talking about. Sounds like a good idea, and should work pretty well. The second idea sounds like it should work very well also.
Cheers,
Alan
Alan - Are you referring to this paper:
LAMBDA: The Ultimate GOTO http://repository.readscheme.org/ftp/papers/ai-lab-pubs/AIM-443.pdf
Or one of the others: http://library.readscheme.org/page1.html
Sidenote: In googling for this paper, I found that Richard Gabriel was going to publish a collection of Guy Steele's papers in book form--but this doesn't appear to have happened. See http://www.dreamsongspress.com. I, for one, would buy this in an instant.
david
Dan Ingalls writes:
Mainly I just like to think about this stuff, and it occurred to me that, if someone were looking for a fun experiment or two, it might turn out to have some practical value. I haven't looked at Exupery to know whether these things are already being done, or whether they might fit well with the other techniques there, but I'm sure Bryce could say right off the bat.
Exupery currently creates contexts using the same code as the interpreter does. It calls a C/Slang helper function to set up the new context.
Exupery is about 2.5 times faster than the interpreter for sends, which indicates that most of the time is spent figuring out which method should be executed rather than in creating the context. The speed improvement comes from using polymorphic inline caches, which make sends from compiled code to compiled code dispatch very quickly. I'd guess that tuning the current system and producing custom machine code for the common case where the new context is recycled would double send performance to about 5 times faster than the interpreter.
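For readers who have not met polymorphic inline caches, here is a small Python model of one send site. It is not Exupery's implementation, which emits machine code: CallSite, MAX_ENTRIES and full_lookups are hypothetical names, and Python's class dictionaries stand in for method dictionaries.

```python
class CallSite:
    """One send site with a small polymorphic inline cache (PIC)."""
    MAX_ENTRIES = 4                      # arbitrary limit for this sketch

    def __init__(self, selector):
        self.selector = selector
        self.cache = []                  # list of (class, function) pairs
        self.full_lookups = 0

    def lookup(self, cls):
        # Slow path: walk the class hierarchy, as a full method lookup would.
        self.full_lookups += 1
        for klass in cls.__mro__:
            if self.selector in vars(klass):
                return vars(klass)[self.selector]
        raise AttributeError(self.selector)

    def send(self, receiver, *args):
        cls = type(receiver)
        for cached_cls, function in self.cache:   # fast path: compare classes
            if cached_cls is cls:
                return function(receiver, *args)
        function = self.lookup(cls)
        if len(self.cache) < self.MAX_ENTRIES:
            self.cache.append((cls, function))
        return function(receiver, *args)
```

After the first send to each receiver class, later sends from the same site skip the hierarchy walk, which is the effect the 2.5 times figure above is attributed to.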
My current plan is to introduce dynamic method inlining, based heavily on Urs Hölzle's Self work, to Exupery after finishing a 1.0. That will completely remove the context creation costs from the most frequently used sends. Dynamic method inlining has the advantage that it can eliminate the sends from #do: loops as well as from leaf methods.
However, for Exupery, finishing 1.0 is much more important than adding dynamic method inlining. A mere 2.5 times gain in send performance is enough to provide a practical speed improvement. For now my time is better spent first debugging compiled blocks then fixing minor issues that limit Exupery's current usefulness.
Bryce
Bryce Kampjes bryce@kampjes.demon.co.uk wrote...
My current plan is to introduce dynamic method inlining, based heavily on Urs Hölzle's Self work, to Exupery after finishing a 1.0. That will completely remove the context creation costs from the most frequently used sends. Dynamic method inlining has the advantage that it can eliminate the sends from #do: loops as well as from leaf methods.
I agree that inlining is the ultimate way to go here. You can see lazy activation as a sort of lazy approach to inlining, but inlining is better because it eliminates the lookup and context switch times completely (when possible (which is typically very often)).
However, for Exupery, finishing 1.0 is much more important than adding dynamic method inlining. A mere 2.5 times gain in send performance is enough to provide a practical speed improvement. For now my time is better spent first debugging compiled blocks then fixing minor issues that limit Exupery's current usefulness.
I certainly agree that a bird in the hand is worth two in the bush, and I'm especially glad that you feel that way. Let's hear it for completion of 1.0!
- Dan
Dan Ingalls writes:
Bryce Kampjes bryce@kampjes.demon.co.uk wrote...
My current plan is to introduce dynamic method inlining, based heavily on Urs Hölzle's Self work, to Exupery after finishing a 1.0. That will completely remove the context creation costs from the most frequently used sends. Dynamic method inlining has the advantage that it can eliminate the sends from #do: loops as well as from leaf methods.
I agree that inlining is the ultimate way to go here. You can see lazy activation as a sort of lazy approach to inlining, but inlining is better because it eliminates the lookup and context switch times completely (when possible (which is typically very often)).
Thanks Dan, Inlining is definitely a key optimisation. For Exupery another advantage is that it creates large native methods that have enough code to be able to be classically optimised. Small methods get in the way of many optimisations. Inlining will provide both a great speed improvement and also make later optimisations possible.
The lazy activation approach you describe is orthogonal to inlining. Ideally we would have both, especially if the lazy context creation is done along the lines Jecel described, by moving the context creation code forward to the first send. A truly dynamic approach, which creates a context lazily in the manner of lazy initialisation, risks spending more time evaluating the context creation checks than was spent creating the context.
The idea of cloning contexts is effectively the same as creating custom machine code to generate the context. The advantage of custom machine code is there is no loop overhead and no branch mispredict when exiting the loop.
For the interpreter this may be a decent win. The biggest issue is that the speed improvement Exupery has over the interpreter indicates that most of the time during sends is being spent outside of context creation. But if you're after a 10% gain, or you've got an engine with different performance characteristics, it could be worth trying.
Bryce
Hi
I wanted to know how this relates to the way VW treats blocks: clean blocks such as [:each | each zork], copying blocks and full blocks. Is anybody able to compare them?
Stef
On 22 March 06, at 15:11, Dan Ingalls wrote:
Hi -
I've got my little Squeak in Java running (hope to send out a link soon), and I've been pondering how to make it run faster. In the process, I've thought of two techniques, one of which is new (to me) and the other occurred to me years ago, but I never tried it out.
Since neither would really be all that hard to do in Squeak, I thought I'd mention them here for those folks who delight in such things, and with the further hope that someone might actually try them out.
Lazy Activation
This was the next thing I was going to do for Apple Smalltalk back when I got drafted to the hotel business back in 1987. The essence of the idea is that the purpose of activating a context is to save the execution state in case you have to do a send and, conversely, you don't really need an activation if you never need to do a real send.
I had a lot of fun instrumenting the VM to figure out just how many activations could be avoided in this way, and my recollection is that it was roughly 50%. I believe the statistics were better dynamically than statically, because there are a lot of methods that, in general need to be activated, but they may begin with a test such as position > limit ifTrue: [^ false] and for every time that this test succeeds, you can get away without ever needing an activation.
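To make the guard-clause case concrete, here is a minimal sketch of the kind of method being described. The class name, selectors and instance variables are hypothetical and not taken from the original message. In Squeak, > on two SmallIntegers is handled by a special arithmetic bytecode and ifTrue: is compiled inline as a conditional jump, so the early-return path completes without any real send.

```smalltalk
"Hypothetical class with instance variables 'position' and 'limit'."
DemoStream>>hasRoom
    "Fast path: compare and return. No real send happens here,
     so under lazy activation no context would need to be built."
    position > limit ifTrue: [^ false].
    "Slow path: a real send, so a full activation is required
     before control can leave this method."
    ^ self refillBuffer
```

Every call that takes the first branch is one of the activations that the scheme would avoid; only calls that reach the second branch pay for a real context.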
But, you say, you still need a pointer to the method bytes and a stack frame, and this is true, but you don't need to allocate and initialize a full context, nor to transfer the arguments. The idea is that, when you hit the send, you do the lookup, find the method, and then jump to a *separate copy* of the interpreter that has a different set of bytecode service routines. For instance, 'loadTemp' will, depending on the argument count, load from the stack of the calling method (which is still the "active" context). 'Push', since there is no allocated stack, pushes into a static array and, eg, 'plus' does the same old add, but it gets its args from the static array, and puts its result back there. And if anything fancy, such as a real send, does occur, then a special routine is called to do a real activation, copy this static state into it appropriately, and retry the bytecode in the normal interpreter.
It's probably worth confirming the results that I remember, but I wouldn't be surprised if one could almost double the speed of Squeak in this manner.
Cloned Activation
This one I just thought of, but I can't believe someone hasn't already tried it, either in Squeak or some similar system. The idea here is to provide a field in the method cache for an extra copy of a properly initialized context for the method (ie, correct frame size, method installed, pc and stack pointer set, etc). Then, when a send occurs, all you have to do is an array copy into blank storage, followed by a short copy of receiver and args from the calling stack.
There's a space penalty for carrying all the extra context templates, of course, but I think it's not unreasonable. Also, one could avoid it for all one-time code by only allocating the extra clone on the second call (ie, first call gets it into the method cache; second call allocates clone for the cache).
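The template-and-copy idea can be modelled in a workspace with a plain Array standing in for a context. This is only a sketch under that assumption: the slot layout and the values stored are invented for illustration and do not match the real layout of a Squeak context.

```smalltalk
"Hypothetical slot layout: 1 sender, 2 pc, 3 stack pointer,
 4 method, 5 receiver, 6 first argument."
| template frame |
template := Array new: 6.
template at: 2 put: 1; at: 3 put: 0; at: 4 put: #someMethod.

"On a send: one bulk copy of the pre-initialised template..."
frame := template copy.
"...followed by the short copy of receiver and arguments."
frame at: 5 put: 'the receiver'; at: 6 put: 'the argument'.

frame at: 4.       "the method slot came along with the bulk copy"
template at: 5.    "nil: the template itself is never modified"
```

Allocating the template only on the second call, as suggested above, would simply mean leaving the cache field empty until a method has been seen once.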
I have little sense of how much this might help these days -- I haven't looked in detail at the activation code for quite a while. Obviously the worse it is right now, the more this technique might help.
Mainly I just like to think about this stuff, and it occurred to me that, if someone were looking for a fun experiment or two, it might turn out to have some practical value. I haven't looked at Exupery to know whether these things are already being done, or whether they might fit well with the other techniques there, but I'm sure Bryce could say right off the bat.
- Dan
stéphane ducasse wrote:
Hi
I wanted to know how this relates to the way VW treats blocks: clean blocks such as [:each | each zork], copying blocks and full blocks. Is anybody able to compare them?
These are different things. The optimized blocks in VW still require full context activations, they just avoid issues with references to their creating context: A clean block is completely independent of its context, so VW creates the block at compile time and stores it in the literal frame of the method. A copying block needs some values from the context which are known not to be changeable after the block has been created (method receiver and arguments and variables which are never assigned to after the block's creation), so these values can be copied into the newly created block, but the block does not need a reference to its creating context, so that context does not have to be stabilized when the method returns. A full block needs a reference to its context, either because it contains a return or because it reads variables which may change after its creation, or writes into temporaries outside of its own scope.
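The three kinds of block can be illustrated with a short workspace sketch. The variable names are made up for the example, and the classification in the comments simply follows the description above; the code itself is ordinary Smalltalk and does not depend on how a particular compiler implements each kind.

```smalltalk
| factor scaled sum |

"Clean block: refers to nothing outside its own arguments."
#(3 1 2) asSortedCollection: [:a :b | a <= b].

"Copying block: reads 'factor', which is never assigned again
 after the block is created, so its value can be copied in."
factor := 10.
scaled := #(1 2 3) collect: [:each | each * factor].

"Full block: assigns to 'sum', a temporary outside its own
 scope, so it needs a reference to its creating context."
sum := 0.
#(1 2 3) do: [:each | sum := sum + each].

scaled.    "#(10 20 30)"
sum.       "6"
```

A block containing a method return (^) would also be a full block, but that case can only be shown inside a method rather than in a workspace.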
In contrast, Dan's scheme does not deal with blocks but activations in general, and it tries to avoid creating a stack frame if possible. IMO it is an optimization that should be investigated.
Cheers, Hans-Martin
Tx
On 23 March 06, at 19:44, Hans-Martin Mosner wrote:
stéphane ducasse wrote:
Hi
I wanted to know how this relates to the way VW treats blocks: clean blocks such as [:each | each zork], copying blocks and full blocks. Is anybody able to compare them?
These are different things. The optimized blocks in VW still require full context activations, they just avoid issues with references to their creating context: A clean block is completely independent of its context, so VW creates the block at compile time and stores it in the literal frame of the method. A copying block needs some values from the context which are known not to be changeable after the block has been created (method receiver and arguments and variables which are never assigned to after the block's creation), so these values can be copied into the newly created block, but the block does not need a reference to its creating context, so that context does not have to be stabilized when the method returns. A full block needs a reference to its context, either because it contains a return or because it reads variables which may change after its creation, or writes into temporaries outside of its own scope.
In contrast, Dan's scheme does not deal with blocks but activations in general, and it tries to avoid creating a stack frame if possible. IMO it is an optimization that should be investigated.
Cheers, Hans-Martin
Marcus Denker wrote:
The interesting thing is that a quite large percentage of those come from the beloved
someCollection size == 0 ifTrue: []
pattern that lots of people like so much... "calling isEmpty is too slow" they will tell you, (and ifEmpty: is *really* evil). As the main objective is speed, they of course don't use #=.
It could be premature optimization, but then again it could be lack of familiarity with the idiom. The above snippet is a straight translation of
if (someCollection.size() == 0) {}
which is pretty common practise in curly-brace languages. Of course, if curly-braces are what you're used to, sending #== rather than #= is an easy mistake to make.
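For comparison, here is a small workspace sketch of the three ways of writing the emptiness test, followed by a case where identity and arithmetic equality disagree. The variable names are arbitrary; the large-integer result is what a Squeak image gives, where numbers beyond the SmallInteger range are ordinary heap objects.

```smalltalk
| someCollection a b |
someCollection := OrderedCollection new.

"All three answer true here, but only the last two say what is meant."
someCollection size == 0.    "identity: works only because 0 is a SmallInteger"
someCollection size = 0.     "arithmetic equality"
someCollection isEmpty.      "the intention-revealing form"

"Identity stops tracking numeric value for larger numbers."
a := 2 raisedTo: 100.
b := 2 raisedTo: 100.
a = b.     "true: same value"
a == b.    "false in Squeak: two distinct LargePositiveInteger instances"
```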
I've been writing a fair amount of javascript lately, and been bitten by the reverse mistake, doing assignment when I want to test for equality.
"Dan Ingalls" Dan.Ingalls@Post.Harvard.edu wrote: <snip>
I regret that I don't have time to fix these right now. However, if there is a well-intentioned soul out there, he or she will perhaps find the method below to be quite useful. It found 165 methods in my system with this pattern.
I'll do it, in my (virgin) 3.9a-6721.
frank
"Frank Shearar" frank.shearar@angband.za.org volunteered:
"Dan Ingalls" Dan.Ingalls@Post.Harvard.edu wrote:
<snip>
I regret that I don't have time to fix these right now. However, if there is a well-intentioned soul out there, he or she will perhaps find the method below to be quite useful. It found 165 methods in my system with this pattern.
I'll do it, in my (virgin) 3.9a-6721.
OK, I made a 166kB changeset, and the mail bounced (the attachment was too large). That's probably for the best, because it forces me to try to make MCDs.
Now that I know how, I'll post the MCDs to Mantis soon. I guess that the maintainers of the various packages can take a look at the changes there?
frank
"Frank Shearar" frank.shearar@angband.za.org wrote:
"Frank Shearar" frank.shearar@angband.za.org volunteered:
"Dan Ingalls" Dan.Ingalls@Post.Harvard.edu wrote:
<snip>
I regret that I don't have time to fix these right now. However, if there is a well-intentioned soul out there, he or she will perhaps find the method below to be quite useful. It found 165 methods in my system with this pattern.
I'll do it, in my (virgin) 3.9a-6721.
OK, I made a 166kB changeset, and the mail bounced (the attachment was too large). That's probably for the best, because it forces me to try to make MCDs.
Now that I know how, I'll post the MCDs to Mantis soon. I guess that the maintainers of the various packages can take a look at the changes there?
Er, when I make these MCDs (mark the repository as storing diffs, hit the Save button) my image (3.9a-6721) pops up a never-ending sequence of "Diffing..." messages, but never seems to stop. I mean, I'm trying to save the Collections mcd, which altered about 10 or so methods, and the saving process has already taken 7 minutes! Am I doing something crazily wrong?
My previous attempt was on the Traits package, and that took at least an hour before I gave up.
Is there a way to split up a ChangeSet into a set of per-package change sets? Then I can post those to Mantis instead of MCDs.
frank
"Frank Shearar" frank.shearar@angband.za.org wrote:
Is there a way to split up a ChangeSet into a set of per-package change sets? Then I can post those to Mantis instead of MCDs.
frank
Yes, I included such functionality in PackageInfoExtras (in the changesorter menus) - but I am afraid it is slightly broken in the current version on SM - and right now I don't recall exactly what the problem was.
My original changeset with this function works IIRC. It is on Mantis:
http://bugs.impara.de/view.php?id=1730
regards, Göran
This is really strange. I do not have time now to check but we will.
Stef
Er, when I make these MCDs (mark the repository as storing diffs, hit the Save button) my image (3.9a-6721) pops up a never-ending sequence of "Diffing..." messages, but never seems to stop. I mean, I'm trying to save the Collections mcd, which altered about 10 or so methods, and the saving process has already taken 7 minutes! Am I doing something crazily wrong?
My previous attempt was on the Traits package, and that took at least an hour before I gave up.
Is there a way to split up a ChangeSet into a set of per-package change sets? Then I can post those to Mantis instead of MCDs.
frank
Hi Frank,
I suggest to just use MCZ (no diffing). Monticello nicely deals with merging different versions (compared to changesets vs fileOuts).
Adrian
On Feb 14, 2006, at 11:12 , stéphane ducasse wrote:
This is really strange. I do not have time now to check but we will.
Stef
Er, when I make these MCDs (mark the repository as storing diffs, hit the Save button) my image (3.9a-6721) pops up a never-ending sequence of "Diffing..." messages, but never seems to stop. I mean, I'm trying to save the Collections mcd, which altered about 10 or so methods, and the saving process has already taken 7 minutes! Am I doing something crazily wrong?
My previous attempt was on the Traits package, and that took at least an hour before I gave up.
Is there a way to split up a ChangeSet into a set of per-package change sets? Then I can post those to Mantis instead of MCDs.
frank
As per Adrian's suggestion I've uploaded all the MCZs to the Mantis bug report (http://bugs.impara.de/view.php?id=2788) - except for Morphic. The Upload File section quotes a maximum file size of 2,000k, and Morphic-fbs.66 is 1,084,635 bytes, yet Mantis complains that that file's too large. Could it be that the maximum file size is actually 1000K? The next-largest file I uploaded was 792K.
Any suggestions how to get the Morphic file up? I'd love to make an MCD from Morphic-fbs.66 and Morphic-md.65, for instance. Ooh, hang on! I see a Morphic-fbs.66(md.65).mcd file in my package-cache! OK, I've uploaded that to the bug report. Hopefully that'll work for propagating my changes.
frank
----- Original Message -----
From: "Adrian Lienhard" adi@netstyle.ch
To: "The general-purpose Squeak developers list" squeak-dev@lists.squeakfoundation.org
Sent: Tuesday, February 14, 2006 12:17 PM
Subject: Re: Use of == for arithmetic equality
Hi Frank,
I suggest to just use MCZ (no diffing). Monticello nicely deals with merging different versions (compared to changesets vs fileOuts).
Adrian
On Feb 14, 2006, at 11:12 , stéphane ducasse wrote:
This is really strange. I do not have time now to check but we will.
Stef
Er, when I make these MCDs (mark the repository as storing diffs, hit the Save button) my image (3.9a-6721) pops up a never-ending sequence of "Diffing..." messages, but never seems to stop. I mean, I'm trying to save the Collections mcd, which altered about 10 or so methods, and the saving process has already taken 7 minutes! Am I doing something crazily wrong?
My previous attempt was on the Traits package, and that took at least an hour before I gave up.
Is there a way to split up a ChangeSet into a set of per-package change sets? Then I can post those to Mantis instead of MCDs.
frank
Actually, you should have uploaded the MCZs to
http://source.squeakfoundation.org/inbox.html
Which works nice and easy using Monticello itself ...
- Bert -
On 14.02.2006 at 12:30, Frank Shearar wrote:
As per Adrian's suggestion I've uploaded all the MCZs to the Mantis bug report (http://bugs.impara.de/view.php?id=2788) - except for Morphic. The Upload File section quotes a maximum file size of 2,000k, and Morphic-fbs.66 is 1,084,635 bytes, yet Mantis complains that that file's too large. Could it be that the maximum file size is actually 1000K? The next-largest file I uploaded was 792K.
Any suggestions how to get the Morphic file up? I'd love to make an MCD from Morphic-fbs.66 and Morphic-md.65, for instance. Ooh, hang on! I see a Morphic-fbs.66(md.65).mcd file in my package-cache! OK, I've uploaded that to the bug report. Hopefully that'll work for propagating my changes.
frank
Ah, OK. I thought that fixes were supposed to go to Mantis first, then to inbox after the fixes had been vetted. I'm sorry if I've made more work for everyone!
Perhaps I should leave the files where they are though; otherwise we'll end up with two sets of fixes and people merging the fixes might get confused? Or would you prefer I leave a note in Mantis, and save all the changes to inbox?
frank
----- Original Message -----
From: "Bert Freudenberg" bert@impara.de
To: "The general-purpose Squeak developers list" squeak-dev@lists.squeakfoundation.org
Sent: Tuesday, February 14, 2006 1:41 PM
Subject: Re: Use of == for arithmetic equality
Actually, you should have uploaded the MCZs to
http://source.squeakfoundation.org/inbox.html
Which works nice and easy using Monticello itself ...
- Bert -
On 14.02.2006 at 12:30, Frank Shearar wrote:
As per Adrian's suggestion I've uploaded all the MCZs to the Mantis bug report (http://bugs.impara.de/view.php?id=2788) - except for Morphic. The Upload File section quotes a maximum file size of 2,000k, and Morphic-fbs.66 is 1,084,635 bytes, yet Mantis complains that that file's too large. Could it be that the maximum file size is actually 1000K? The next-largest file I uploaded was 792K.
Any suggestions how to get the Morphic file up? I'd love to make an MCD from Morphic-fbs.66 and Morphic-md.65, for instance. Ooh, hang on! I see a Morphic-fbs.66(md.65).mcd file in my package-cache! OK, I've uploaded that to the bug report. Hopefully that'll work for propagating my changes.
frank
On 2/14/06, Frank Shearar frank.shearar@angband.za.org wrote:
Ah, OK. I thought that fixes were supposed to go to Mantis first, then to inbox after the fixes had been vetted. I'm sorry if I've made more work for everyone!
Yes, you're right. And Bert isn't ;-). Package teams should be the only ones that publish to the integration inbox, because they are the ones that decide whether a package version is integration-ready or not. So putting them on Mantis is the correct action (Mantis is in effect the inbox of the package teams).
Sure, but this does not work when you have cross-cutting changes. For cross-cutting changes like Lukas's TextAnchor fix, if we had waited for feedback from just the teams involved, it would not be in the image yet.
Stef
On 14 Feb. 06, at 13:06, Cees De Groot wrote:
On 2/14/06, Frank Shearar frank.shearar@angband.za.org wrote:
Ah, OK. I thought that fixes were supposed to go to Mantis first, then to inbox after the fixes had been vetted. I'm sorry if I've made more work for everyone!
Yes, you're right. And Bert isn't ;-). Package teams should be the only ones that publish to the integration inbox, because they are the ones that decide whether a package version is integration-ready or not. So putting them on Mantis is the correct action (Mantis is in effect the inbox of the package teams).
On 14.02.2006 at 13:06, Cees De Groot wrote:
On 2/14/06, Frank Shearar frank.shearar@angband.za.org wrote:
Ah, OK. I thought that fixes were supposed to go to Mantis first, then to inbox after the fixes had been vetted. I'm sorry if I've made more work for everyone!
Yes, you're right. And Bert isn't ;-). Package teams should be the only ones that publish to the integration inbox, because they are the ones that decide whether a package version is integration-ready or not. So putting them on Mantis is the correct action (Mantis is in effect the inbox of the package teams).
Well, the wiki entry at http://source.squeakfoundation.org/inbox.html still says
"This is a place for code to be posted for possible inclusion in 3.9alpha (and future Squeak versions). The code can be referenced from a Mantis submission."
Seems I didn't notice the change of rules in the meantime.
Also, I'd prefer directly publishing to a MC repository rather than uploading to Mantis.
- Bert -
On 2/14/06, Bert Freudenberg bert@impara.de wrote:
Also, I'd prefer directly publishing to a MC repository rather than uploading to Mantis.
Absolutely. I'd publish that sort of stuff in my own repository in such a case and add pointers to Mantis.
"Cees De Groot" cdegroot@gmail.com wrote:
On 2/14/06, Bert Freudenberg bert@impara.de wrote:
Also, I'd prefer directly publishing to a MC repository rather than uploading to Mantis.
Absolutely. I'd publish that sort of stuff in my own repository in such a case and add pointers to Mantis.
Er, so should I save the in-image package MCZs to the inbox repository? Or will whoever manages the inbox grab the MCZs from the Mantis bug report?
frank
On 2/14/06, Frank Shearar frank.shearar@angband.za.org wrote:
Er, so should I save the in-image package MCZs to the inbox repository? Or will whoever manages the inbox grab the MCZs from the Mantis bug report?
Neither. The regular flow is community -> mantis -> package team -> inbox -> release team. In that way, we ensure that the package team controls what is released and when, plus that they don't need to look left and right for inbound stuff. And it keeps the inbox clean (which helps the release team in evaluating how much stuff is available there).
It's all a bit harder with this new system for cross-cutting patches, but that's the trade-off we made when moving to the team model (and, with it, giving the team ultimate responsibility for what is released - responsibility and control go hand in hand here, as they should)
"Cees De Groot" cdegroot@gmail.com answered:
On 2/14/06, Frank Shearar frank.shearar@angband.za.org wrote:
Er, so should I save the in-image package MCZs to the inbox repository? Or will whoever manages the inbox grab the MCZs from the Mantis bug report?
Neither. The regular flow is community -> mantis -> package team -> inbox -> release team. In that way, we ensure that the package team controls what is released and when, plus that they don't need to look left and right for inbound stuff. And it keeps the inbox clean (which helps the release team in evaluating how much stuff is available there).
OK, which means that I don't need to do anything further then, yes?
frank
On 14 Feb. 06, at 16:13, Cees De Groot wrote:
On 2/14/06, Frank Shearar frank.shearar@angband.za.org wrote:
Er, so should I save the in-image package MCZs to the inbox repository? Or will whoever manages the inbox grab the MCZs from the Mantis bug report?
Neither. The regular flow is community -> mantis -> package team -> inbox -> release team. In that way, we ensure that the package team controls what is released and when, plus that they don't need to look left and right for inbound stuff. And it keeps the inbox clean (which helps the release team in evaluating how much stuff is available there).
It's all a bit harder with this new system for cross-cutting patches, but that's the trade-off we made when moving to the team model (and, with it, giving the team ultimate responsibility for what is released - responsibility and control go hand in hand here, as they should)
Sure, Cees, this is working well for simple package-oriented fixes. But now it would be good that we get the changes! Look at the Network team, for example. Should Frank stack that in the team so that we get something in the future (instead of getting it done now)? And each team can do a merge after. Else we can have endless discussions. See Lukas's TextAnchor fix.
Stef
stéphane ducasse wrote:
Sure, Cees, this is working well for simple package-oriented fixes. But now it would be good that we get the changes! Look at the Network team, for example. Should Frank stack that in the team so that we get something in the future (instead of getting it done now)?
Given that this code neither has any cross-cutting requirements nor fixes anything that's broken, I don't see why you have the urge to push this in right now. As a matter of fact, I'm somewhat concerned about these changes and would like them to be reviewed - there are various places where the pattern "foo == 0" is absolutely appropriate and where #= should *not* be used, and I wonder whether these places have been taken into account properly.
And each team can do a merge after. Else we can have endless discussions. See the Fix of TextAnchor of lukas.
Where was this endless discussion?
Cheers, - Andreas
On 14 Feb. 06, at 16:41, Andreas Raab wrote:
stéphane ducasse wrote:
Sure, Cees, this is working well for simple package-oriented fixes. But now it would be good that we get the changes! Look at the Network team, for example. Should Frank stack that in the team so that we get something in the future (instead of getting it done now)?
Given that this code neither has any cross-cutting requirements nor fixes anything that's broken, I don't see why you have the urge to push this in right now. As a matter of fact, I'm somewhat concerned about these changes and would like them to be reviewed - there are various places where the pattern "foo == 0" is absolutely appropriate and where #= should *not* be used, and I wonder whether these places have been taken into account properly.
Indeed you are right. So have a look and let us know. I still think that it is important that we find a process to
- give fast feedback on changes
- find a way for cross-cutting changes.
And each team can do a merge after. Else we can have endless discussions. See the Fix of TextAnchor of lukas.
Where was this endless discussion?
Endless discussion and no discussion are not the same, are they? I sent a post and got no reaction (not even a complaint :)), so we included it, since it was blocking Lukas's enhancement for icon support in the browser, and we did not want the code to rot when the fix is simple.
Cheers,
- Andreas
Hi frank
publish in the inbox and we will take them from there.
Stef
On 14 Feb. 06, at 16:02, Frank Shearar wrote:
"Cees De Groot" cdegroot@gmail.com wrote:
On 2/14/06, Bert Freudenberg bert@impara.de wrote:
Also, I'd prefer directly publishing to a MC repository rather than uploading to Mantis.
Absolutely. I'd publish that sort of stuff in my own repository in such a case and add pointers to Mantis.
Er, so should I save the in-image package MCZs to the inbox repository? Or will whoever manages the inbox grab the MCZs from the Mantis bug report?
frank
Great. I've just started, with the Collections package.
frank
----- Original Message -----
From: "stéphane ducasse" ducasse@iam.unibe.ch
To: "The general-purpose Squeak developers list" squeak-dev@lists.squeakfoundation.org
Sent: Tuesday, February 14, 2006 5:19 PM
Subject: Re: Posting Fixes (was Re: Use of == for arithmetic equality)
Hi frank
publish in the inbox and we will take them from there.
Stef
On 14 Feb. 06, at 16:02, Frank Shearar wrote:
"Cees De Groot" cdegroot@gmail.com wrote:
On 2/14/06, Bert Freudenberg bert@impara.de wrote:
Also, I'd prefer directly publishing to a MC repository rather than uploading to Mantis.
Absolutely. I'd publish that sort of stuff in my own repository in such a case and add pointers to Mantis.
Er, so should I save the in-image package MCZs to the inbox repository? Or will whoever manages the inbox grab the MCZs from the Mantis bug report?
frank
After adding what new versions I could to the inbox repository, we have:
Packages not in inbox:
- Etoys
- Monticello
- MorphicExtras
- PlusTools
- Traits

Packages written to inbox:
- Collections
- CollectionsTests
- Compiler
- Graphics
- Kernel
- KernelTests
- Morphic
- Network
- Protocols
- SmaCC
- ST80
- System
- Tools

Packages not updated because of potential conflicts:
- Multilingual
frank
On 2/14/06, stéphane ducasse ducasse@iam.unibe.ch wrote:
publish in the inbox and we will take them from there.
Fine, Stef. I take it then you'll do the merge w.r.t. the networking code and file code, as far as afflicted. I'll ask the I/O team to suspend any work until that's done.
Fine, Stef. I take it then you'll do the merge w.r.t. the networking code and file code, as far as afflicted. I'll ask the I/O team to suspend any work until that's done.
I was too fast to react, and Andreas is right, so we need to get feedback on those changes first.
Andreas had said that foo == 0 was sometimes the right thing. I have difficulty imagining many such cases (and those that I can imagine, I think of as kludges), so I'm curious what he had in mind.
../Dave
Dave Mason wrote:
Andreas had said that foo == 0 was sometimes the right thing. I have difficulty imagining many such cases (and those that I can imagine, I think of as kludges), so I'm curious what he had in mind.
Here is an example:
SystemNavigation>>allObjectsDo: aBlock
    "Evaluate the argument, aBlock, for each object in the system excluding SmallIntegers."
    | object |
    object _ self someObject.
    [0 == object] whileFalse: [
        aBlock value: object.
        object _ object nextObject]
The reason this is correct is that for proxies you want the operation to be side-effect free; #== is side-effect free, whereas #= may not be. In addition, I sometimes use #== in critical code to bullet-proof against arbitrarily broken implementations of #= (but that's another rant for another time).
Cheers, - Andreas
Thanks frank and we will merge that in 3.9 as soon as this is done.
Stef
On 13 Feb 06, at 21:28, Frank Shearar wrote:
"Dan Ingalls" Dan.Ingalls@Post.Harvard.edu wrote:
<snip>
I regret that I don't have time to fix these right now. However, if there is a well-intentioned soul out there, he or she will perhaps find the method below to be quite useful. It found 165 methods in my system with this pattern.
I'll do it, in my (virgin) 3.9a-6721.
frank
Dan Ingalls Dan.Ingalls@Post.Harvard.edu wrote:
Date: Mon, 13 Feb 2006 09:37:42 -0800
...
happened not to intern (ie force unique instances of) SmallIntegers. In this case the use of == to mean arithmetic equality will not work properly. In my opinion, all such occurrences in the system should be eliminated ASAP; == is not an arithmetic compare in any Smalltalk I know of.
But #== and #= are equivalent in ST-80 as described in [Adele Goldberg, David Robson: "Smalltalk-80 The Language", 1989, p 115]:
Objects, that can not change their internal state are called immutable objects. This means, that, once created, they are not destroyed and then recreated when they are needed again. Rather, the 256 instances of Character are created at the time the system is initialized and remain in the system. ... Besides Characters, the Smalltalk-80 system includes SmallIntegers and Symbols as immutable objects.
In the same book, there are expressions like
[p 139] ... we want to know how many of the Characters are a or A.
    count _ 0.
    letters do: [:each | each asLowercase == $a ifTrue: [count _ count + 1]]
[p 168] Thus 'a string' asSymbol == 'a string' asSymbol answers true.
etc. It might be bad style to use #== instead of #=, but this "bad habit" is certainly not rooted in the usage of curly-brace languages alone.
Now, my question. Are SmallIntegers, Characters and Symbols in Squeak immutable objects in the sense of the above definition, i.e. not destroyable? If not, why and when was it changed in Squeak? If they are still immutable, why is this planned to be changed?
Greetings,
whg
On 13-Feb-06, at 4:09 PM, Wolfgang Helbig wrote:
Now, my question. Are SmallIntegers, Characters and Symbols in Squeak immutable objects in the sense of the above definition, i.e. not destroyable? If not, why and when was it changed in Squeak? If they are still immutable, why is this planned to be changed?
SmallIntegers, Characters and Symbols are indeed immutable in Squeak. However, numbers in general are not quite the same; it is entirely possible to have several LargeIntegers with the same numeric value, and that is one quite obvious case where using #== instead of #= could provide a surprise. All one has to do is remember that #== means 'is the same object' and #= means 'is an equal object' to see that confusing them is not smart.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim If you think nobody cares about you, try missing a couple of payments.
Hi,
Actually the confusion is in Object class.
All objects can be tested for identity (message #==), but it doesn't make sense to offer messages for equality (message #=) all over the hierarchy. Not every object has a meaningful notion of equality and, imho, giving every object a default equality (based on identity) makes the confusion bigger.
I think putting equality in Object is one of the biggest misconceptions we suffer from daily. Only a small set of objects can answer equality in a meaningful way.
If you combine this fact with the awful object capabilities of the mainstream languages (where objects never survive a run), you get widespread abuse of equality. Most of the time, in those languages, equality is just used to work around the problems with the (nonexistent) identity.
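For reference, a small workspace sketch of the default under discussion: in Squeak, Object>>= falls back to identity unless a subclass overrides it. Nothing here comes from the original post; the variable names are illustrative only.

```smalltalk
| a b |
a := Object new.
b := Object new.
a = a.                  "true: same object"
a = b.                  "false: Object>>= is implemented as ^self == anObject"
'abc' = 'abc' copy.     "true: String overrides #= to compare contents"
'abc' == 'abc' copy.    "false: the copy is a distinct object"
```

Evaluate each expression with "print it" to see the result.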
Cheers,
-- Diego
On Mon, 13-02-2006 at 16:25 -0800, tim Rowledge wrote:
On 13-Feb-06, at 4:09 PM, Wolfgang Helbig wrote:
Now, my question. Are SmallIntegers, Characters and Symbols in Squeak immutable objects in the sense of the above definition, i.e. not destroyable? If not, why and when was it changed in Squeak? If they are still immutable, why is this planned to be changed?
SmallIntegers, Characters and Symbols are indeed immutable in Squeak. However, numbers in general are not quite the same; it is entirely possible to have several LargeIntegers with the same numeric value, and that is one quite obvious case where using #== instead of #= could provide a surprise. All one has to do is remember that #== means 'is the same object' and #= means 'is an equal object' to see that confusing them is not smart.
tim
Wolfgang Helbig helbig@lehre.ba-stuttgart.de wrote...
But #== and #= are equivalent in ST-80 as described in [Adele Goldberg, David Robson: "Smalltalk-80 The Language", 1989, p 115]:
Objects, that can not change their internal state are called immutable objects. This means, that, once created, they are not destroyed and then recreated when they are needed again. Rather, the 256 instances of Character are created at the time the system is initialized and remain in the system. ... Besides Characters, the Smalltalk-80 system includes SmallIntegers and Symbols as immutable objects.
You have to be careful. There are two issues: immutability and "interning" (guaranteeing a unique object for each value). Immutability does not at all guarantee that == will work for arithmetic equality. Squeak's LargeIntegers and Squeak's Floats are designed to be immutable in their protocol, yet == among these is not equivalent to =. It is true that a==b implies a=b, but it is *not* true that "(a==b) not" implies "(a=b) not". To get this second effect requires "interning" wherein the object you get back for a given value is always the same object -- as with characters (there is a table of them), and Symbols (there is a table of them as well). It *happens* that most modern Smalltalks (ie since ST-76 which did *not* have this property) implement SmallIntegers as a pointer with a tag and the value, so any two SmallIntegers of the same value are encoded perforce as the same object, so they are inherently interned.
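A minimal workspace sketch of the distinction between immutability and interning, assuming a Squeak image. The value 2 raisedTo: 100 is chosen only because it is a LargePositiveInteger on both 32-bit and 64-bit images; the variable names are illustrative.

```smalltalk
| big1 big2 |
"Interned values: equal values are always the same object."
(3 + 4) == 7.               "true: SmallIntegers are encoded in the pointer itself"
#foo == 'foo' asSymbol.     "true: Symbols are looked up in a table"

"Immutable in protocol, but not interned."
big1 := 2 raisedTo: 100.
big2 := big1 + 1 - 1.       "same value, freshly allocated object"
big1 = big2.                "true: arithmetic equality"
big1 == big2.               "false: two distinct LargePositiveIntegers"
```

Evaluate each expression with "print it"; only the last one answers false.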
In the same book, there are expressions like
[p 139] ... we want to know how many of the Characters are a or A.
    count _ 0.
    letters do: [:each | each asLowercase == $a ifTrue: [count _ count + 1]]
[p 168] Thus 'a string' asSymbol == 'a string' asSymbol answers true.
These are OK exactly for the above reason.
etc. It might be bad style to use #== instead of #=, but this "bad habit" is certainly not rooted in the usage of curly-brace languages alone.
I think we can excuse this usage neither as merely a bad habit nor as justifiable because it occurs in other languages. It is simply not Squeak arithmetic. I'm sure your curly-brace languages are equally unhappy with the use of = in place of == .
Now, my question. Are SmallIntegers, Characters and Symbols in Squeak immutable objects in the sense of the above definition, i.e. not destroyable? If not, why and when was it changed in Squeak? If they are still immutable, why is this planned to be changed?
Yes, they are immutable, but not all Integers are interned. Just evaluate... (1e10 + 1 - 1) == 1e10 and be convinced.
I know of no plans to change immutability of any of these objects. But there is definitely a plan to stop using == to test arithmetic equality in Squeak ;-).
Thanks for your interest.
- Dan