Hi all.
If you write a naive class with a method that calls itself:
Forkbomb>>forkbomb ^ self forkbomb.
Squeak 3.5 on Debian linux will eventually crash, leaving a huge stack dump in the terminal.
Okay; it's really my fault for making such a stupid class, but I'm playing around with proxied objects, message sends etc and it's happening quite often while I'm debugging.
Is there a way to limit the invocation stack to a sane depth, after which a process/thread (whichever is Squeak terminology) gets suspended and an error is thrown?
Same probably applies to number of processes made by a thread.
mikevdg.
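One way to picture the kind of limit being asked for is a method that checks its own depth before recursing. The following is only a minimal sketch in browser notation: the #stackDepth helper and the limit of 1000 are made up for illustration, and this is not the code by Avi Bryant mentioned later in the thread.

```smalltalk
"Hypothetical sketch; none of this is in the base image."
Object subclass: #Forkbomb
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Sketches'.

Forkbomb>>stackDepth
    "Answer how many contexts are on the sender chain of the caller.
     The cost grows linearly with the depth of the stack."
    | depth ctx |
    depth := 0.
    ctx := thisContext sender.
    [ctx isNil] whileFalse:
        [depth := depth + 1.
        ctx := ctx sender].
    ^ depth

Forkbomb>>forkbomb
    "Guarded recursion: signal an Error once the (arbitrary) limit is passed."
    self stackDepth > 1000
        ifTrue: [^ self error: 'Stack depth limit exceeded'].
    ^ self forkbomb
```

Walking the whole sender chain on every call is slow, so a check like this suits debugging sessions rather than production code.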
Search back through the list archives for the last month or so; I think that Avi Bryant took a first cut at something like this.
Joshua
On Sun, Sep 07, 2003 at 06:00:29PM +0200, Michael van der Gulik wrote:
Hi all.
If you make a naive class that calls itself:
Forkbomb>>forkbomb ^ self forkbomb.
Squeak 3.5 on Debian linux will eventually crash, leaving a huge stack dump in the terminal.
Okay; it's really my fault for making such a stupid class, but I'm playing around with proxied objects, message sends etc and it's happening quite often while I'm debugging.
Is there a way to limit the invocation stack to a sane level of deepness, after which a process/thread (whichever is Squeak terminology) gets suspended and an error thrown?
Same probably applies to number of processes made by a thread.
mikevdg.
I've tried Avi's code; it's really nice, simple code. Unfortunately, on very fast machines on Unix, it gets into some kind of infinite loop of error notifications that cause the Emergency Evaluator to pop up when you test it with:
Smalltalk createStackOverflow.
The low space watcher has a similar problem. I've talked with Ian Piumarta (the Unix vm developer, he was very helpful) about this, and he can reproduce it too; but we don't know what the problem is. Here's what he told me:
------- The low space process tries to open a notifier window using the "headroom" memory that's left over, but if this requires more memory than is available then you're in trouble.
Exactly why this might require more than the default 200K is a mystery, but this seems to be what's happening. I can reproduce the crash reliably in my image using createStackOverflow, which indicates a bug somewhere.
The system shouldn't be allocating lots of memory just to pop up a notifier. From the VM backtrace it looks like the UI process (which is running the recursion) is being reactivated even while the low-space notifier is still open on the screen. ------
The same thing happens to me when I set the low-memory threshold even at 20 *megabytes*.
Best,
Mike
On Sunday 07 September 2003 09:04 am, Joshua 'Schwa' Gargus wrote:
Search back through the list archives for the last month or so; I think that Avi Bryant took a first cut at something like this.
Joshua
On Sun, 7 Sep 2003, Michael Fremont wrote:
I've tried Avi's code; it's really nice, simple code. Unfortunately, on very fast machines on Unix, it gets into some kind of infinite loop of error notifications that cause the Emergency Evaluator to pop up when you test it with:
Smalltalk createStackOverflow.
Hm, you're right - I can't get it to work at all on Linux. I wrote and tested it on Mac OS X, where it worked fine.
A recursive loop seems to lock the unix VM *much* harder than it was locking the Mac OS X VM - the interrupt key doesn't work, UI events don't get through - even when the offending process is forked at the lowest priority.
Inserting a call to "Processor yield" into #createStackOverflow fixes these problems. Are the scheduling rules different between VMs?
Well, the scheduling rules are the same, but on the OS X Carbon VM I've used pthreads to separate the UI from the running of the VM.
Tapping the interrupt key will always be noticed immediately by the UI thread, which deposits the keystroke on the VM event queue. In this case the interpreter pthread is grinding away, most likely doing a full GC trying to find some memory.
The interpreter pthread of course needs to grind through the bytecodes to get to the point of grabbing the events from the VM event queue and then processing the interrupt key. I do signal the inputSemaphoreIndex semaphore, which is the event queue semaphore, but I'm not sure if that helps any in this case. However, that should cause the high-priority eventsensor task to wake up immediately and process the interrupt keystroke.
Can't say why the Unix one doesn't have the same behaviour. I wonder whether the path length to get to the X11 handleEvent() logic is so long, or it never gets to run, so that perhaps you aren't waiting long enough for the keystroke to be polled for?
On Sunday, September 7, 2003, at 11:13 AM, Avi Bryant wrote:
On Sun, 7 Sep 2003, Michael Fremont wrote:
I've tried Avi's code; it's really nice, simple code. Unfortunately, on very fast machines on Unix, it gets into some kind of infinite loop of error notifications that cause the Emergency Evaluator to pop up when you test it with:
Smalltalk createStackOverflow.
Hm, you're right - I can't get it to work at all on Linux. I wrote and tested it on Mac OS X, where it worked fine.
A recursive loop seems to lock the unix VM *much* harder than it was locking the Mac OS X VM - the interrupt key doesn't work, UI events don't get through - even when the offending process is forked at the lowest priority.
Inserting a call to "Processor yield" into #createStackOverflow fixes these problems. Are the scheduling rules different between VMs?
--
John M. McIntosh johnmci@smalltalkconsulting.com 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
On Sunday 07 September 2003 11:48 am, John M McIntosh wrote: ...
Can't say why the Unix one doesn't have the same behaviour. I wonder whether the path length to get to the X11 handleEvent() logic is so long, or it never gets to run, so that perhaps you aren't waiting long enough for the keystroke to be polled for?
I don't think this is the problem, or at least, not the only one. You can get this to happen without involving keystrokes at all.
Set the LowSpaceWatcher threshold to 150 *megabytes* and then run
Smalltalk createStackOverflow
and, in Linux, you'll never get the low space notifier; or if you do, you'll also get an Emergency Evaluator and the system will be locked up, with createStackOverflow *still* running, and an eventual crash.
So this seems deeper than not seeing keystrokes.
Best,
Mike
This appears to be the same problem mentioned in recent reports of not being able to interrupt a recursive process. If something is making the low space signal not work properly - leaving the runaway process careering like a drunken driver down the roadway of memory loss - then one of our major system safeguards is ineffective. Not good.
On Wed, Apr 28, 2004 at 10:36:16PM -0600, tim@sumeru.stanford.edu wrote:
This appears to be the same problem mentioned in recent reports of not being able to interrupt a recursive process. If something is making the low space signal not work properly - leaving the runaway process careering like a drunken driver down the roadway of memory loss - then one of our major system safeguards is ineffective. Not good.
[Background on this thread: Squeak hangs up and crashes with out-of-memory condition due to low space alarm not working. Observed on Unix/Linux VM.]
I can't quite get a handle on what is happening here, but I am fairly sure that it is *not* a problem with the low space signal. The signal is properly created in the VM. It is caught by the waiting low space watcher process, and the notifier is being displayed. *After* these things take place, Squeak crashes with an out of memory condition, as if the original runaway recursive process had proceeded without interruption (although I don't think this is what is literally happening).
I confirmed the above with some good ol' Fortran-style print debugging to stdout, and tested with #createStackOverflow. Unfortunately, I can't get much further with this approach. Adding a debug message in the #createStackOverflow method is enough to make the problem go away, so I can't say for sure if the #createStackOverflow is continuing to run after the low space signal has been received by the low space watcher process.
One other observation: The problem does not occur in MVC. The combination of Morphic and a Unix/Linux VM is apparently needed to make the problem occur.
For what it's worth, here is the stdout (console) output that I get when running #createStackOverflow on my Linux box. The messages are coming from SystemDictionary>>lowSpaceWatcher with debugging print statements added to indicate when the semaphore is installed and when the low space signal is handled.
587:149:SystemDictionary>>lowSpaceWatcher:install low space semaphore
587:149:SystemDictionary>>lowSpaceWatcher:enable low space interrupts
587:149:SystemDictionary>>lowSpaceWatcher:wait on low space semaphore
587:149:SystemDictionary>>lowSpaceWatcher:low space semaphore signal received
587:149:SystemDictionary>>lowSpaceWatcher:about to display low space notifier
out of memory
1114748112 SystemDictionary>createStackOverflow
1114748020 SystemDictionary>createStackOverflow
1114747928 SystemDictionary>createStackOverflow
1114747836 SystemDictionary>createStackOverflow
1114747744 SystemDictionary>createStackOverflow
<snipped long stack dump, VM crash>
Consider this an interim report to keep the thread alive.
Dave
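The hand-off described in this message (a watcher process waits on a semaphore, something signals it, the watcher reacts) can be sketched with an ordinary Semaphore. This is only an illustration with made-up variable names, meant to be run from a workspace. It does not touch the VM's real low space semaphore, and it does not reproduce the crash.

```smalltalk
"Hypothetical stand-in for the low space watcher, using a plain Semaphore."
| lowSpace watcher log |
log := OrderedCollection new.
lowSpace := Semaphore new.

"The watcher runs above user priority, so it starts at once
 and then blocks in #wait."
watcher := [lowSpace wait.
            log add: 'signal received']
                forkAt: Processor lowIOPriority.

"Stands in for the VM signaling low space. The waiting watcher has the
 higher priority, so it preempts the signaling process here and runs
 before the next statement."
lowSpace signal.
log size
```

Evaluated from a process at user priority, the log holds one entry as soon as #signal returns. That is the behaviour one would expect from the real watcher as well, which is why the continuing recursion reported above is surprising.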
"David T. Lewis" lewis@mail.msen.com wrote:
[Background on this thread: Squeak hangs up and crashes with out-of-memory condition due to low space alarm not working. Observed on Unix/Linux VM.]
Also on RISC OS and win32 (IIRC)
I can't quite get a handle on what is happening here, but I am fairly sure that it is *not* a problem with the low space signal. The signal is properly created in the VM. It is caught by the waiting low space watcher process, and the notifier is being displayed. *After* these things take place, Squeak crashes with an out of memory condition, as if the original runaway recursive process had proceeded without interruption (although I don't think this is what is literally happening).
I did a whole load of test runs for this a while back and it really does seem like the problem process keeps on running despite the notifier.
One other observation: The problem does not occur in MVC. The combination of Morphic and a Unix/Linux VM is apparently needed to make the problem occur.
Just tried again for 3.7, 3.6, 3.5 and 3.2 in both morphic and mvc and none of them behave 'properly'. 3.6 & 3.5 morphic shewed notifiers but still crashed. None of the others even shewed a notifier.
tim
--
Tim Rowledge, tim@sumeru.stanford.edu, http://sumeru.stanford.edu/tim
Strange OpCodes: EOS: Erase Operating System
On Sun, May 16, 2004 at 10:43:08AM -0700, Tim Rowledge wrote:
"David T. Lewis" lewis@mail.msen.com wrote:
On Wed, Apr 28, 2004 at 10:36:16PM -0600, tim@sumeru.stanford.edu wrote:
One other observation: The problem does not occur in MVC. The combination of Morphic and a Unix/Linux VM is apparently needed to make the problem occur.
Just tried again for 3.7, 3.6, 3.5 and 3.2 in both morphic and mvc and none of them behave 'properly'. 3.6 & 3.5 morphic shewed notifiers but still crashed. None of the others even shewed a notifier.
Interesting. I just tried it again, running on Linux using image version 'Squeak3.7beta of ''1 April 2004'' [latest update: #5868]'. I consistently get the out-of-memory crash in Morphic, but things work correctly in MVC. Well, not quite correctly, there is a horrible bug when traversing from the MVC project back to a Morphic project in which I get a red screen of death on some tabs in the Morphic project - but that would be the topic of a separate thread, and it appears unrelated to the low memory notifier issue.
Overall, we are seeing inconsistent behavior that is possibly dependent on timing and phase of the moon. This one might be tricky to debug.
Dave
I haven't quite gotten to the bottom of this, but here is what I've found so far.
The problem is that the low space notification is executing in the context of the event tickler process, while whatever method is running away with the memory is running in another process and does not get suspended. I can confirm this with OSProcess trace commands embedded in #lowSpaceWatcher, and a hacked-up version of #createStackOverflow that forces a halt before the system hangs up.
Furthermore, if the event tickler process is terminated, low space notification starts working as expected. So it must be some interaction of these processes, and it does not happen unless the event tickler is running.
What I can't figure out is how the low space signal is getting handled in the context of the event tickler process, when the low space watcher process is the one that is waiting on the semaphore. The low space signaling in the VM interpreter seems simple and straightforward, so it has to be something on the image side ... I just can't spot it.
Anybody have an idea?
I've attached a copy of the hacks I'm using to debug this in case anyone wants to reproduce the problem without hanging their image.
Dave
The low space watcher is interrupted in the context of the wrong process when the eventTickler process (or other high priority process) is running. This prevents low space detection from functioning properly.
Here is a one line fix to Project>>interruptName: that corrects the problem.
Dave
"David T. Lewis" lewis@mail.msen.com wrote:
The low space watcher is interrupted in the context of the wrong process when the eventTickler process (or other high priority process) is running. This prevents low space detection from functioning properly.
Here is a one line fix to Project>>interruptName: that corrects the problem.
Certainly within the limited arena of running SystemDictionary>createStackOverflow on my RISC OS machine it works well. No idea yet if it will cover every base, but until then Dave - frellin'A.
Don't suppose you've spotted why Exception>outer is broken, seeing that you're on a roll?
tim
--
Tim Rowledge, tim@sumeru.stanford.edu, http://sumeru.stanford.edu/tim
Useful random insult:- Trying out for the javelin retrieval team.
On Sat, May 22, 2004 at 03:26:14PM -0700, Tim Rowledge wrote:
Don't suppose you've spotted why Exception>outer is broken, seeing that you're on a roll?
I don't even know what Exception>>outer would do if it *wasn't* broken, but I see there are some test cases for it, so it must be something worth caring about. Maybe I'll have a look at it just in case it turns out that I actually am on a roll. Or I just could buy a lottery ticket ;)
Dave
Could someone with a Linux/Unix system please also review this fix? Crashing the VM when out of memory is not considered good practice, so it would be a shame to let 3.7 out the door without addressing this.
Dave
This bug affects inexperienced Squeak programmers, who are more likely to accidentally write infinite recursions, and who are less likely to know how to recover their changes when Squeak hangs up or crashes as a result of the bug. The bug was reported in September 2003, and a one-line fix was proposed in May 2004.
It's a bit tricky to test this because the bug crashes the VM (or hangs it up in infinite memory allocation), but it's only a one-line fix so if someone could please have a look I'd appreciate it.
Even with the fix in place, the low space watcher does not kick in to handle an infinite recursion on Windows. I *guess* there is no upper memory bound for Squeak on Windows (or at least I don't know how to change it), so the low space watcher process in Squeak does not get signaled until swap space runs out. On Linux the situation is different when you start the VM with an explicit memory limit (otherwise it should be the same as on Windows). I would say this fix can be included if it helps in this special case, because otherwise it does no harm (but it also does not help much).
I think Avi's StackWatcher (1) works better as a corrective measure for infinite recursion problems than the low space watcher. I rate it as a valuable goodie and therefore placed it on SqueakMap for easy access. It works fine with the latest 3.7b and it may be a good candidate for inclusion in 3.8.
Alex
(1) http://lists.squeakfoundation.org/pipermail/squeak-dev/2003-August/065832.html
Attached is a change set that I used to debug the stack overflow problem and confirm the fix. This only runs on a Unix VM with OSProcess loaded, but the overflow problem is a bit tricky to debug so I'm posting this in case someone wants to reproduce what I did.
Basically this just writes debug trace messages to standard output so I can keep track of what process is running what method in what order. Just some good ol' fashioned Fortran debugging, but what the heck, it worked.
From the preamble:
This is what I used to debug the stack overflow problem. Load OSProcess first, then load this change set.
Intended for use on Unix/Linux. Run the Squeak vm with a fixed memory allocation (squeak -memory 30m) in order to force the out-of-memory condition.
Open a ProcessBrowser, then evaluate 'Smalltalk createStackOverflow'. You should see messages on stdout that confirm that the runaway recursion keeps going even after the low space semaphore has been signaled.
Now apply the LowSpaceWatcherFix change set, and evaluate 'Smalltalk createStackOverflow'. The low space watcher should catch the runaway method right away.
On Thu, Jul 01, 2004 at 12:10:30PM +0200, Alexander@Lazarevic.de wrote:
Even with the fix in place the low space watcher does not kick in to handle the situation of an infinite recursion on windows. I *guess* there is no upper memory bound for squeak on windows (or at least I don't know how to change that) and so the low space watcher process in squeak does not get signaled until swap space runs out. On linux there is a different situation when you start the vm with an explicit memory limit (otherwise it should be the same as on windows). I would say this
Newer Linux VMs grow memory dynamically, and do not start with any explicit memory limit.
On 02 Jul 2004, at 09:37, Ross Boylan wrote:
Newer Linux VMs grow memory dynamically, and do not start with any explicit memory limit.
There are two command-line options (with equivalent environment variables) to control how memory is allocated on Unix:
If no options are given then memory is allocated dynamically with the limit set at 75% of the available virtual memory.
If -memory N{mk} is given then memory is allocated statically; the argument to the option defines a hard upper limit.
If -mmap N{mk} is given then memory is allocated dynamically, with an explicit upper limit to the amount of memory that will be allocated (but the "75% of available virtual memory" limit still applies).
Ian
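To make the N{mk} notation above concrete, here is a small workspace sketch that turns an argument such as the '30m' in "squeak -memory 30m" into a byte count. It assumes that k means 1024 bytes and m means 1024 * 1024 bytes. That multiplier is an assumption on my part, not something stated in this thread, and the variable names are made up.

```smalltalk
"Hypothetical sketch: interpret a size argument of the form N, Nk or Nm.
 Assumes k = 1024 bytes and m = 1024 * 1024 bytes."
| arg suffix bytes |
arg := '30m'.
suffix := arg last asLowercase.
bytes := suffix isDigit
    ifTrue: [arg asNumber]
    ifFalse:
        [(arg copyFrom: 1 to: arg size - 1) asNumber *
            (suffix = $m
                ifTrue: [1024 * 1024]
                ifFalse: [1024])].
bytes
```

Under that assumption, '30m' comes out as 31457280 bytes. Any suffix other than m is treated as k here, which a real parser would want to reject instead.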
squeak-dev@lists.squeakfoundation.org