From: "Andreas Raab" < andreas.raab@g... >

Hi Guys,
I was always suspicious about the way CCodeGenerator handled #interpret with respect to temps (e.g., inlining all temps into interpret and randomly renaming them t1 ... tN) as it completely spoils life-time analysis for the C compiler (which has to assume that temps may be read in other code branches and may even "optimize" them into wasting unneeded registers across code branches).
First I was using Change Set: CGeneratorEnhancements-ajh Date: 12 February 2002 Author: Anthony Hannan (ajh18@cornell.edu)
which localized the variables in interpret(), but your change set is a cleaner solution.
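To make the point about localized variables concrete, here is a minimal sketch in C. It is not the generated interpret() code: the three-opcode bytecode set and all names are hypothetical, chosen only to contrast temps hoisted to function scope with temps declared inside the branch that uses them.

```c
/* Hypothetical three-opcode bytecode set, for illustration only. */
enum { OP_PUSH = 0, OP_ADD = 1, OP_HALT = 2 };

/* Style A: temps declared once for the whole function, the way
   inlined temps t1 ... tN end up at the top of one large function. */
static int run_hoisted(const unsigned char *code) {
    int stack[16];
    int sp = 0;
    int t1, t2;                       /* scope: the entire function */
    for (;;) {
        switch (*code++) {
        case OP_PUSH:
            t1 = *code++;             /* operand byte follows the opcode */
            stack[sp++] = t1;
            break;
        case OP_ADD:
            t1 = stack[--sp];
            t2 = stack[--sp];
            stack[sp++] = t1 + t2;
            break;
        case OP_HALT:
            return stack[sp - 1];
        default:
            return -1;                /* unknown opcode */
        }
    }
}

/* Style B: each temp is declared in the branch that uses it, so its
   scope (and lifetime) visibly ends with that branch. */
static int run_localized(const unsigned char *code) {
    int stack[16];
    int sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH: {
            int value = *code++;
            stack[sp++] = value;
            break;
        }
        case OP_ADD: {
            int rhs = stack[--sp];
            int lhs = stack[--sp];
            stack[sp++] = lhs + rhs;
            break;
        }
        case OP_HALT:
            return stack[sp - 1];
        default:
            return -1;                /* unknown opcode */
        }
    }
}
```

Both functions compute the same result; the only difference is where the temps are declared, which is the property the change sets discussed here are about.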
I downloaded and set up a new image with SM and loaded the latest VMMaker (or so I think/thought/believe).
Ran into some issues with the version of VMMaker you used and the current one. Tim and you can sort out what's happening.
TMethod lost an instance variable, globalStructureBuildMethodHasFoo, and a change in TMethod>>setSelector: args: locals: block: primitive: was overwritten.
I'm unsure who's at fault for these two: a) Interpreter lost the class variable BlockMethodIndex, and b) the method isUnwindMarked: is missing {Isn't that the block closure stuff?}
Also, the two variables in interpret(), localReturnContext and localReturnValue, end up with no declaration.
Well, because I have been using the Hannan changeset in earlier work, since 3.2.7b1, the difference is too small or too difficult to measure. For the GCC flavor I don't think there was any difference in code size (40 bytes smaller for the entire VM, but I was missing the isUnwindMarked: method, so I think that accounts for the 40 bytes).
For CodeWarrior on OS9 there was a 46-byte difference for the interpret() function, but any improvement is lost in measurement noise. In the past, the reason I used the Hannan changeset was that CodeWarrior obviously just gave up doing any useful local variable analysis, stuck the first couple of vars into registers, and was stupid... It also made great improvements in how the 68K version worked with GCC on OpenBSD 3.x.
From a note of mine to the list on April 9th, 2002 talking about this:
on a 68k BSD box with GCC the new numbers are {Hannan changeset}
  1,614,205 bytecodes/sec and 57,652 sends/sec
versus my previous one using the jumptable modification
  1,550,387 bytecodes/sec and 55,080 sends/sec
versus what I started with
  1,439,884 bytecodes/sec and 51,098 sends/sec
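For a sense of scale, the figures above work out to roughly 12% more bytecodes/sec than the starting point and roughly 4% more than the jumptable version. The arithmetic can be sketched as follows; the helper name percent_gain is made up for this note and the constants are simply the numbers quoted above.

```c
#include <stdio.h>

/* Percentage gain of `after` relative to `before`. */
static double percent_gain(double before, double after) {
    return (after - before) / before * 100.0;
}

/* Print the gains of the Hannan changeset build over the two earlier builds. */
static void print_gains(void) {
    const double start_bc = 1439884.0, jump_bc = 1550387.0, hannan_bc = 1614205.0;
    const double start_sends = 51098.0, jump_sends = 55080.0, hannan_sends = 57652.0;

    printf("bytecodes/sec vs start:     %.1f%%\n", percent_gain(start_bc, hannan_bc));
    printf("bytecodes/sec vs jumptable: %.1f%%\n", percent_gain(jump_bc, hannan_bc));
    printf("sends/sec vs start:         %.1f%%\n", percent_gain(start_sends, hannan_sends));
    printf("sends/sec vs jumptable:     %.1f%%\n", percent_gain(jump_sends, hannan_sends));
}
```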
So yes the change is good.
----------
PS Another topic

In my measurements of the macrobenchmark I see
  55.9% is interpret()
  4.5% is sweepPhase
  5.0% is markPhase
  3.0% UpdatePointers (spelling?)
  0.9% is incCompMove
Thus 10% lurks in the mark/sweep phase of the GC.
Fiddling with ObjectMemory>>startField can be measured in the tinybenchmarks. I'm considering checking for type 0 first, else type = 2, otherwise it's a SmallInteger. That becomes a load with set condition, a branch on condition, a check against 2, and a branch on condition. This improves the macrobenchmark by 2%, but degrades the tinybenchmark because of the integers it creates. Mmm, a case statement might be useful here...
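As a rough sketch of the test ordering described above, and not the actual startField code: the version below assumes the type lives in the low two bits of the word, with 0 and 2 as the two non-integer types and any odd value meaning SmallInteger. The enum, the mask, and the function names are all hypothetical.

```c
/* Hypothetical classification of a field word by its low two bits. */
typedef enum { FIELD_OOP, FIELD_HEADER, FIELD_SMALL_INT } FieldKind;

#define TYPE_MASK 3u   /* assumed: type is in the low two bits */

/* Test the integer tag first, then distinguish type 0 from type 2. */
static FieldKind classify_int_first(unsigned long word) {
    unsigned long type = word & TYPE_MASK;
    if (type & 1u) return FIELD_SMALL_INT;
    if (type == 0u) return FIELD_OOP;
    return FIELD_HEADER;
}

/* The ordering considered above: type 0, else type 2, otherwise integer. */
static FieldKind classify_oop_first(unsigned long word) {
    unsigned long type = word & TYPE_MASK;
    if (type == 0u) return FIELD_OOP;
    if (type == 2u) return FIELD_HEADER;
    return FIELD_SMALL_INT;
}

/* The same decision as a case statement, leaving the ordering
   (or a jump table) to the C compiler. */
static FieldKind classify_switch(unsigned long word) {
    switch (word & TYPE_MASK) {
    case 0u: return FIELD_OOP;
    case 2u: return FIELD_HEADER;
    default: return FIELD_SMALL_INT;
    }
}
```

All three give the same answer for every word; which one is fastest depends on how often each type turns up, which fits the macrobenchmark and tinybenchmark pulling in opposite directions.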
--
===========================================================================
John M. McIntosh johnmci@smalltalkconsulting.com 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================