Hi Marcel, hi all,
I feel that we should really address this issue soon. After we have fixed the debugger chains related to process-faithful debugging, errors from event handlers are the #1 cause for lost images in my experience, and it can really be painful to lose your work because you had a slip or breakpoint in a balloon text selector anywhere. :(
I'm open to anything that works, but we should not let the perfect be the enemy of the good for an issue of that impact. In the past I proposed multiple options:
1. Disable erroneous event handlers (#errorOnEvent, similar to existing #errorOnDraw, #errorOnStep, and #errorOnLayout): Debugger chains may be caused by different events, but most commonly through mouseOver or mouseDrag events. We need to discuss whether we want to cover other events such as #windowMetricChange and #windowPaint, too. Event handlers can be reenabled through the halo debug menu of the relevant morph (which is unfortunately not very discoverable, we could also discuss adding a visual hint for non-drawing errors as an overlay similarly to #drawErrorOn:).
Are there any general objections to that approach? If the only challenge is performance, what benchmarks would be required? Would some simple FrameRateMorph observation while hovering and dragging in an image with a couple of open windows suffice?
2. Generic debugger chain detection: There still might be other situations which spawn endless debuggers, such as when users juggle with many processes. Through a simpe time-based counter heuristic in the ToolSet when handling errors, we could prevent further debuggers when a threshold is exceeded and just suspend further processes. We could discuss how to communicate this to the user - note events in an ensured transcript, leave the project, run an emergency evaluator with highest priority. If we tune the threshold wisely, e.g., 50 debuggers within 30 seconds, I believe that we will not have too many false positives.
A third idea specific to event errors again would be to introduce another shortcut next to Cmd + Dot that triggers Project primitiveError and immediately leaves the project.
We could combine multiple of these ideas.
I am looking forward to your feedback. I would really love to get this one solved, too.
Best, Christoph
On 2020-09-19T19:35:48+00:00, christoph.thiede@student.hpi.uni-potsdam.de wrote:
Hi Marcel, hi all,
What's the strategy here? Which events do you want to catch? I think that only mouse-over (incl. enter/leave) events are problematic. For a generic solution, benchmarks are mandatory. Where are we standing at that front?
I think the big question is how generic we would like to be at this point. Beside mouse over/enter/leave events, dragMove events could be a problem, too (though we are planning not to expose them to any morph). Mouse wheel events, too, maybe, especially when they originally come from a touchpad or touch screen input. What is about window events? #windowMetricChange and #windowPaint can be signaled with a high frequency, and when a host system such as Windows is mad again, #windowActivated, too. So in my inbox proposal, I covered all kinds of events and overwrote a #isHighFrequentEvent getter for certain event classes that are known not to be recorded very high-frequently (so basically, an allow list instead of a ban list).
I have been testing these changes for a few months in my image now, and while I did not experience any further debugger chains, there was a quite small number of sporadic false positives, for instance, after switching a project, but unfortunately, I did not take notes about them and probably there is a relationship to any other change in my working copy. :-( In my personal doIt list, I have added the two following lines for quickly fixing the problem:
World allMorphsDo: #resumeAfterEventError. World allMorphsDo: #resumeAfterDrawError.
However, I'm still wondering about a frequency-based debugger detection that could record the last let's say 5 debugger invocation times and if they have been too close, another warning could be displayed. This approach would be less technical but better aligned with the actual purpose of this proposal, which is to prevent a long chain of debuggers overflowing the user. Here is a short, hacked implementation of such a feature for the StandardToolSet:
initialize
RecentDebuggersMutex := Semaphore forMutualExclusion.
RecentDebuggers := OrderedCollection new.
5 timesRepeat: [RecentDebuggers addFirst: DateAndTime new]. ToolSet register: self.
debugProcess: aProcess context: aContext label: aString contents: contents fullView: aBool
| now | (aProcess isTerminated and: [aString beginsWith: 'Debug it']) ifTrue: [ ^ Project uiManager inform: 'Nothing to debug. Process has terminated.\Expression optimized.' withCRs translated].
now := DateAndTime now.
(RecentDebuggersMutex critical: [RecentDebuggers notEmpty ifTrue: [RecentDebuggers removeLast]]) ifNotNil: [:time |
now - time < 1 second ifTrue: [
DebuggerOverflow
ifNil: [
[DebuggerOverflow := Semaphore new.
(Project uiManager confirm: 'A lot of debuggers has been detected in recent past. Would you like to suppress further debuggers?' translated)
ifTrue: [self suppressDebuggers]]
ensure: [
DebuggerOverflow signal.
DebuggerOverflow := nil]]
ifNotNil: [:sem |
sem wait]
]].
RecentDebuggersMutex critical: [RecentDebuggers addFirst: now].
self flag: #todo. "'Mutex' is not always signaled correctly"
(self shouldSuppressDebuggers ==> [Project uiManager
confirm: ('This will open a Debugger:\\{1}\{2}\{3}\\Proceed?' withCRs translated format: {
aProcess. aContext. contents })
title: aString])
ifFalse: [aProcess terminate. ^ self]. ^ Debugger openOn: aProcess context: aContext label: aString contents: contents fullView: aBool
This has saved my image two or three times, too, but on the contrary, the quality of such a heuristic may be questionary ...
Looking forward to your opinions! :-)
Best,
Christoph
Von: Squeak-dev <squeak-dev-bounces(a)lists.squeakfoundation.org> im Auftrag von Taeumel, Marcel Gesendet: Montag, 8. Juni 2020 12:07 Uhr An: squeak-dev Betreff: Re: [squeak-dev] The Inbox: Morphic-ct.1641.mcz
Everything: http://forum.world.st/template/NamlServlet.jtp?macro=search_page&node=45... ;-)
Are those just pointers to your commit messages?
Best, Marcel
Am 08.06.2020 12:01:54 schrieb Thiede, Christoph <christoph.thiede(a)student.hpi.uni-potsdam.de>:
Hi Marcel,
thank you! :-)
Discussion: http://forum.world.st/bug-in-a-ToolBuilder-Squeak5-3rc2-td5112536.html#a5112... Inbox: Morphic-ct.1636, Morphic-ct.1638, Morphic-ct.1641
Everything: http://forum.world.st/template/NamlServlet.jtp?macro=search_page&node=45... ;-)
Best,
Christoph
http://www.hpi.de/ ________________________________ Von: Squeak-dev <squeak-dev-bounces(a)lists.squeakfoundation.org> im Auftrag von Taeumel, Marcel Gesendet: Montag, 8. Juni 2020 11:52:32 An: squeak-dev Betreff: Re: [squeak-dev] The Inbox: Morphic-ct.1641.mcz
Hi Christoph,
can you point me to the prior inbox artifacts and list discussions/comments about this "safe event processing"? I would really like to take a look at it. Having some safety net for broken #mouseOver: handlers similar to #displayWorldSafely: would really help debugging.
Best, Marcel
Am 07.06.2020 14:46:14 schrieb commits(a)source.squeak.org <commits(a)source.squeak.org>:
Christoph Thiede uploaded a new version of Morphic to project The Inbox: http://source.squeak.org/inbox/Morphic-ct.1641.mcz
==================== Summary ====================
Name: Morphic-ct.1641 Author: ct Time: 7 June 2020, 2:45:17.536284 pm UUID: b90e23e1-ce70-3d49-a8e4-f09af148ef97 Ancestors: Morphic-ct.1638
Refine #isHighFrequentEvent implementation for further event types. Fixes one possible cause of the "color depth = 0" bug. See http://forum.world.st/Image-not-startable-after-save-td5117084.html.
Depends indeed on Morphic-ct.1638.
=============== Diff against Morphic-ct.1638 ===============
Item was added:
- ----- Method: DropEvent>>isHighFrequentEvent (in category 'testing') -----
- isHighFrequentEvent
- ^ false!
Item was added:
- ----- Method: WindowEvent>>isHighFrequentEvent (in category 'testing') -----
- isHighFrequentEvent
- ^ (#(windowClose) includes: self type) not!
Item was changed: ----- Method: WorldState>>doSafely:onErrorThat:setErrorFlag:ifFatal:afterErrorDo: (in category 'update cycle') ----- doSafely: aBlock onErrorThat: errorPredicate setErrorFlag: errorFlag ifFatal: fatalErrorBlock afterErrorDo: postErrorBlock "Evaluate aBlock and keep track of errors during morph invocations."
| finished classesWithErrors | finished := false. classesWithErrors := IdentitySet new. [finished] whileFalse: [ [aBlock value. finished := true] on: Error, Halt, Warning do: [:ex | | err rcvr errCtxt errMorph | (errorPredicate cull: ex) ifFalse: [ex pass]. err := ex description. rcvr := ex receiver.
errCtxt := thisContext. [ errCtxt := errCtxt sender. "Search the sender chain to find the morph causing the problem" [errCtxt notNil and: [(errCtxt receiver isMorph) not]] whileTrue: [errCtxt := errCtxt sender]. "If we're at the root of the context chain then we have a fatal problem" errCtxt ifNil: [^ fatalErrorBlock cull: err].
- errMorph := errCtxt receiver
- ] doWhileTrue: [
- errMorph := errCtxt receiver.
"If the morph causing the problem has already the error flag set, then search for the next morph above in the caller chain." errMorph hasProperty: errorFlag
- ].
- ] whileTrue.
errMorph setProperty: errorFlag toValue: true; changed.
"Catch all errors, one for each receiver class." (classesWithErrors includes: rcvr class) ifFalse: [ classesWithErrors add: rcvr class. ToolSet debugException: ex].
postErrorBlock cull: err. ]].!
Item was changed: ----- Method: WorldState>>processEventsSafely: (in category 'update cycle') ----- processEventsSafely: aHandMorph
^ self doSafely: [aHandMorph processEvents]
- onErrorThat: [:error | self currentEvent ifNil: [true] ifNotNil: [:evt | evt isHighFrequentEvent]]
- onErrorThat: [:error | ActiveEvent isNil or: [ActiveEvent isHighFrequentEvent]]
setErrorFlag: #errorOnEvent ifFatal: [:error | Project current fatalEventHandlingError: error] afterErrorDo: []!