Hi Bryce,
When running Exupery's test suite it crashes every few runs. With a bad (freshly created) image it can crash every single time that TestExuperyPlugin>>testBasicRememberSetsBothOld is run. When the image is first created the other remembered set tests also cause it to crash. Images get more reliable with time.
Hearing the word "plugin" and considering that I am entirely unable to reproduce your example (the test is still running and absolutely nothing bad has happened during -now- 3000 loops) makes me assume that there's something in the plugin which has a GC problem, e.g., a GC occurring in a place where you didn't remap some oop or other.
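To illustrate what I mean by remapping, here is a minimal Slang sketch. The primitive name and what it does are made up for illustration and have nothing to do with Exupery's actual code; only the push/pop pattern around the allocation matters.

```smalltalk
primitiveMakePair
	"Hypothetical primitive in a hypothetical plugin: answer a two-element
	Array whose first slot holds the receiver."
	| rcvr pair |
	rcvr := interpreterProxy stackObjectValue: 0.
	interpreterProxy failed ifTrue: [^nil].
	"The allocation below may trigger a GC that moves rcvr,
	so register the oop with the VM before allocating."
	interpreterProxy pushRemappableOop: rcvr.
	pair := interpreterProxy
		instantiateClass: interpreterProxy classArray
		indexableSize: 2.
	"Fetch the possibly updated oop; the value held before the allocation may be stale."
	rcvr := interpreterProxy popRemappableOop.
	interpreterProxy storePointer: 0 ofObject: pair withValue: rcvr.
	interpreterProxy pop: 1 thenPush: pair
```

Leave out the push/pop pair and the code will still work nearly all the time, failing only when a GC happens to run during that particular allocation.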
Interestingly, the first time I run that method with the fixed VM it crashes. If I run any other test first, it is then stable.
I think that actually proves my point.
I can easily reproduce these crashes with the Exupery development image when running Exupery's test suite. The reason I'm investigating is that a new image I created was crashing every time I ran that test. My older development image only crashes every couple of test runs. The problem is subtle: add a few extra expressions and it will go away.
Yes that too - it's precisely the kind of thing that happens if a GC hits you in a place you haven't thought about it. Let me tell you a rather nasty technique to debug (or rather: stress test) these problems:
If you look at "Smalltalk vmParameters" you will find one which defines the "allocations between GCs", i.e., how many allocations happen before an IGC takes place. If you set this to zero (or one? might try both) the system will run an IGC *every single time* an allocation happens. This is slow (it means about 1-2 ms for every allocated object) but it's the surest way I know to find out if there's a problem - if there is, the system will crash almost instantly.
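In a workspace that could look roughly like the sketch below. I am assuming here that the parameter in question is index 5 on your VM (check the comment of SystemDictionary>>vmParameterAt: to be sure) and that TestExuperyPlugin is an ordinary SUnit TestCase subclass.

```smalltalk
| old |
"Assumption: parameter 5 is 'allocations between GCs' on this VM."
old := Smalltalk vmParameterAt: 5.
"Force an IGC on (nearly) every allocation; try 0 as well if 1 shows nothing."
Smalltalk vmParameterAt: 5 put: 1.
[(TestExuperyPlugin selector: #testBasicRememberSetsBothOld) run]
	ensure: [
		"Restore the previous setting so the image is usable again afterwards."
		Smalltalk vmParameterAt: 5 put: old]
```

Save the image before trying this; if there is a problem, the VM should go down within the first few allocations made by the plugin.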
For me, for now, that fix is enough to be able to produce an image to go with the next Exupery release. Without that fix, a newly created image will crash every time the test suite is run. If the troublesome methods are commented out and the test suite is run a few times, then those tests will pass sometimes but they still crash the VM every few runs. It would be nice to get a proper fix though.
I will vote against adding this fix unless you can provide evidence that such a crash is indeed possible on a stock VM/image pair. Like I said, from what I can see your fix is a no-op which would only slow down the garbage collector.
Cheers, - Andreas