Tony Garnock-Jones writes:
I'm extremely keen to help out on this front. Could you perhaps write a paragraph or two on the process of running the stress tests? I'd like to see it crashing, and to start to get the experience needed to fix it when it does.
First, Exupery only runs on Linux/x86 at the moment. If that's what you're running, follow the instructions for either building or installing Exupery here:
http://minnow.cc.gatech.edu/squeak/3842
If you're planning on debugging the compiler then you'll want to build your own kernel. Debugging normally starts by analysing generated machine code in gdb. I'm not sure how well this would work without a local build.
Once you've got a local working version of Exupery then:
ExuperyProfiler profileAndCompile: [5 timesRepeat: [ExuperyBenchmarks new compilerBenchmark]]
Will profile the expression given and compile the 10 most used methods. This method doesn't do primitive inlining yet so don't expect your bytecode performance to beat VW. I'll add that as soon as the stress test passes.
ExuperyProfiler stressTest
Will run the stress test. In somewhere between 1 and 10 minutes it will crash. Some of the tests freeze; I think this is due to the progress dialog morph, but I haven't investigated. Hitting Alt-. and then proceeding will get past this. The progress dialog morph lock-up is easy to detect: Squeak drops to 0% CPU without crashing.
Once it's crashed look in the Exupery.log file.
If you're lucky the log shows that 10 methods have been compiled since the code cache was initialised. Try compiling these methods manually and figure out which one's causing the crash. Then I normally write a test that reliably creates the crash.
If you're unlucky then the crash was caused by memory corruption. A stack backtrace will probably show that the GC was executing and there may be no compiled methods. Memory corruption crashes can be a big pain because the corruption may have happened a long time ago.
Once you have a test that reproduces the bug, have a look at the crash. It's worthwhile to look at both the currently active context (foo->activeContext on Linux) and the C stack trace.
A context looks like:
(gdb) x/20x foo->activeContext
0x434d5ccc: 0x1736e35f 0x434d5c70 0x000000b1 0x0000000b
0x434d5cdc: 0x40da1f8c 0x808b415b 0x434dcc50 0x00000015
0x434d5cec: 0x434dcee0 0x4035b004 0x4035b004 0x434dcc50
0x434d5cfc: 0x434dced8 0x00000003 0x000001ff 0x000001ff
0x434d5d0c: 0x0000010d 0x00002b3f 0x434a1ab4 0x434a1ab4

0x000000b1 is the byte code program counter
0x0000000b is the stack pointer
0x808b415b is Exupery's return address.
From this it's possible to figure out that it crashed while executing a compiled context, and to trace back to where in the method it was and which basic block it last entered. Exupery only updates the context when it leaves a method, so the stack pointer and Exupery basic block pointer will point to where it re-entered the method.
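Picking those three words out of a dump by eye gets tedious, so here is a rough Python sketch that labels them from pasted gdb output. It is not part of Exupery. The word offsets (2, 3 and 5) come straight from the dump above; the field names are labels made up for this sketch, and the untag step assumes the program counter and stack pointer are stored as Squeak SmallIntegers with a low tag bit of 1, which you should check against your VM.

```python
# Sample dump copied from the gdb session above.
DUMP = """
0x434d5ccc: 0x1736e35f 0x434d5c70 0x000000b1 0x0000000b
0x434d5cdc: 0x40da1f8c 0x808b415b 0x434dcc50 0x00000015
0x434d5cec: 0x434dcee0 0x4035b004 0x4035b004 0x434dcc50
0x434d5cfc: 0x434dced8 0x00000003 0x000001ff 0x000001ff
0x434d5d0c: 0x0000010d 0x00002b3f 0x434a1ab4 0x434a1ab4
"""

# Word offsets within the context, taken from the annotated dump.
# The names are hypothetical labels for this sketch, not VM identifiers.
FIELDS = {
    2: "bytecode_pc",
    3: "stack_pointer",
    5: "exupery_return_address",
}


def parse_gdb_words(dump):
    """Return the 32-bit words from gdb `x/Nx` output, in memory order."""
    words = []
    for line in dump.strip().splitlines():
        # Drop the leading "0xADDRESS:" column and keep the data words.
        _, _, data = line.partition(":")
        words.extend(int(w, 16) for w in data.split())
    return words


def label_context(words):
    """Map each known field name to its raw word from the dump."""
    return {name: words[index] for index, name in FIELDS.items()}


def untag(word):
    """Decode a 32-bit word as a SmallInteger (assumed: low tag bit is 1)."""
    if word & 1 == 0:
        raise ValueError("0x%08x does not have the SmallInteger tag bit" % word)
    # Reinterpret as signed 32-bit, then shift the tag bit away.
    signed = word - (1 << 32) if word >= (1 << 31) else word
    return signed >> 1


if __name__ == "__main__":
    context = label_context(parse_gdb_words(DUMP))
    for name, raw in context.items():
        print("%-24s 0x%08x" % (name, raw))
    print("pc (untagged):", untag(context["bytecode_pc"]))
    print("sp (untagged):", untag(context["stack_pointer"]))
```

Under that tagging assumption, 0x000000b1 untags to 88 and 0x0000000b untags to 5. The return address is left as a raw word, since it is a machine address rather than a number.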
From here, options involve exploring with gdb break-points, adding printf statements (self cCode: 'printf("Entering method\n")'. will add a printf in Slang), or adding calls to validation code (Interpreter methods in the "debug support" category).
Staring at the generated methods sometimes helps, but less so when debugging random bugs. Place a halt at the end of Exupery>>run, then try opening inspectors on the instance variables holding compiled methods (most of them). I save each stage's results in instance variables to make debugging easier. The inspector will open up a graphical view of the method; it's an animated springs and repulsion Connectors graph. The explorers are normal.
After figuring out why the machine code is crashing, we need to find where Exupery went wrong. This is a game of chasing the bug back through the different versions of the method until we find the first version with a fault.
Bryce
P.S. I'm currently working at London Bridge, that's reasonably close to you isn't it? If you're serious about debugging then some pairing may help.