Hi -
In my never-ending quest for questionable behavior in multi-threaded situations, just today I ran into a pattern which is dangerously common in our code. It basically goes like this:
MyClass>>startWorkerProcess
	"worker is an instance variable"
	worker := [self runWorkerProcess] fork.

MyClass>>runWorkerProcess
	"Run the worker process"
	[Processor activeProcess == worker]
		whileTrue: ["...do the work..."].

MyClass>>stopWorkerProcess
	"Stop the worker process"
	worker := nil. "let it terminate itself"
Those of you who can immediately tell what the problem is should get a medal for an outstanding knack of analyzing concurrency problems ;-)
For the rest of us, the problem is that #fork in the above is not deterministic: there is no guarantee that the "worker" variable will have been assigned by the time we enter the worker loop. It *would* be deterministic if the priority were below or above the current process's priority, but when it is the same, scheduling can be affected by environmental effects (external signals, delay processing, etc.), leading to some very obscure runtime problems (in the above, the process would simply never start).
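For illustration, here is a sketch of one conventional workaround (not part of the proposal below; the 'ready' Semaphore is invented for this example): make the forked block wait until the assignment is known to have happened.

```smalltalk
MyClass>>startWorkerProcess
	"worker is an instance variable; 'ready' is a local Semaphore.
	 The forked block blocks on the semaphore, so by the time
	 runWorkerProcess is entered the assignment has completed and
	 the worker loop never sees worker = nil."
	| ready |
	ready := Semaphore new.
	worker := [ready wait. self runWorkerProcess] fork.
	ready signal
```

The cost is an extra synchronization object per start, which is why a fix inside #fork itself is attractive.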
To fix this problem I have changed BlockContext>>fork and BlockContext>>forkAt: to read, e.g.,
BlockContext>>fork
	"Create and schedule a Process running the code in the receiver."
	^self forkAt: Processor activePriority

BlockContext>>forkAt: priority
	"Create and schedule a Process running the code in the receiver at
	the given priority. Answer the newly created process."
	| forkedProcess helperProcess |
	forkedProcess := self newProcess.
	forkedProcess priority: priority.
	priority = Processor activePriority
		ifTrue: [
			helperProcess := [forkedProcess resume] newProcess.
			helperProcess priority: priority - 1.
			helperProcess resume]
		ifFalse: [forkedProcess resume].
	^forkedProcess
This will make sure that #fork has (for the purpose of resumption) the same semantics as forking at a lower priority.
What do people think about this?
Cheers, - Andreas
"Andreas" == Andreas Raab andreas.raab@gmx.de writes:
Andreas> What do people think about this?
Looks good to me. And no, I didn't spot it until I read your explanation, but this does seem like something that could trip up a beginner.
Hi,
On Feb 4, 2008, at 10:04 PM, Andreas Raab wrote:
Hi -
In my never-ending quest for questionable behavior in multi-threaded situations just today I ran into a pattern which is dangerously common in our code. It basically goes like this:
MyClass>>startWorkerProcess
	"worker is an instance variable"
	worker := [self runWorkerProcess] fork.
Why don't you make the assignment atomic?
Mth
On 4-Feb-08, at 1:04 PM, Andreas Raab wrote:
What do people think about this?
I'm not too terribly keen on the idea of a process I fork getting set to a lower priority than the one I requested. Then again, I don't all that often feel the need to fork processes anyway, so perhaps I'm not really entitled to a vote.
I think I'd categorise this example as a bug, plain and simple. Don't do that. It's not a nice idiom at all.
To ameliorate the situation, we could *not* return the process from the #fork method - thus making it pointless to write foo := [blah] fork. I'm sure that would upset some people who are too attached to pretending to be in unix-land.
I'd like to hope that something like self critical: [foo := [blah] fork] might be acceptable as a replacement idiom. There are times when people simply have to accept that the simple-looking way to do something is just plain wrong.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim He who hesitates is probably right.
On 4-Feb-08, at 1:49 PM, tim Rowledge wrote:
On 4-Feb-08, at 1:04 PM, Andreas Raab wrote:
What do people think about this?
I'm not too terribly keen on the idea of a process I fork getting set to a lower priority than the one I requested.
Oh and of course the first thing that will come along to spoil the party is some smart-arse doing [foo] forkAt: Processor activePriority+1 ;-)
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Life by Norman Rockwell, but screenplay by Stephen King.
On Feb 5, 2008 10:49 AM, tim Rowledge tim@rowledge.org wrote:
On 4-Feb-08, at 1:04 PM, Andreas Raab wrote:
What do people think about this?
I'm not too terribly keen on the idea of a process I fork getting set to a lower priority than the one I requested. Then again, I don't all that often feel the need to fork processes anyway, so perhaps I'm not really entitled to a vote.
I think I'd categorise this example as a bug, plain and simple. Don't do that. It's not a nice idiom at all.
To ameliorate the situation, we could *not* return the process from the #fork method - thus making it pointless to write foo := [blah] fork. I'm sure that would upset some people who are too attached to pretending to be in unix-land.
It would upset me. All my code would break and I might cry.
I'd like to hope that something like self critical:[foo:= [blah] fork]
I've never heard of a #critical: method outside the Semaphore class. I'm assuming that self is a Semaphore? In that case, the forked/child process is not constrained by the critical: block and will escape to begin its loop with no guarantee that foo has been assigned.
Gulik.
On Mon, Feb 4, 2008 at 10:49 PM, tim Rowledge tim@rowledge.org wrote:
There are times when people simply have to accept that the simple looking way to do something is just plain wrong.
+1. If you could figure out a way to get people to actually do this then we could change the software world. :)
On Feb 5, 2008 10:04 AM, Andreas Raab andreas.raab@gmx.de wrote:
Hi -
To fix this problem I have changed BlockContext>>fork and BlockContext>>forkAt: to read, e.g.,
BlockContext>>fork
	"Create and schedule a Process running the code in the receiver."
	^self forkAt: Processor activePriority

BlockContext>>forkAt: priority
	"Create and schedule a Process running the code in the receiver at
	the given priority. Answer the newly created process."
	| forkedProcess helperProcess |
	forkedProcess := self newProcess.
	forkedProcess priority: priority.
	priority = Processor activePriority
		ifTrue: [
			helperProcess := [forkedProcess resume] newProcess.
			helperProcess priority: priority - 1.
			helperProcess resume]
		ifFalse: [forkedProcess resume].
	^forkedProcess
This will make sure that #fork has (for the purpose of resumption) the same semantics as forking at a lower priority has.
What do people think about this?
I'm thinking that the above is an ugly hack. When we eventually write an interpreter which is truly multitasking, your original bug will re-appear.
What you wanted to do, rather than redefining the Process class, is:
MyClass>>startWorkerProcess
	"keepRunning is an instance variable"
	keepRunning := true.
	"Always run worker processes at a lower priority than the controller process."
	[self runWorkerProcess] forkAt: ProcessScheduler somethingeratherPriority.
	"sorry; I don't have Squeak handy right now, but you get the idea."
MyClass>>runWorkerProcess
	"Run the worker process"
	[keepRunning] whileTrue: ["...do the work..."].

MyClass>>stopWorkerProcess
	"Stop the worker process"
	keepRunning := false. "let it terminate itself"
Or better, make an abstraction:
process := WorkerTask doing: [someObject doSomeWork].
process start.
process stop.
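A minimal sketch of such an abstraction (WorkerTask is hypothetical; the class name, protocol, and priority choice are invented for illustration, not an existing Squeak class):

```smalltalk
Object subclass: #WorkerTask
	instanceVariableNames: 'block keepRunning process'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Example-Workers'.

WorkerTask class>>doing: aBlock
	"Answer a task that will repeatedly evaluate aBlock while started."
	^self new setBlock: aBlock

WorkerTask>>setBlock: aBlock
	block := aBlock

WorkerTask>>start
	"Fork below the controller's priority, so the assignment to
	 'process' is complete before the loop first runs."
	keepRunning := true.
	process := [[keepRunning] whileTrue: [block value]]
		forkAt: Processor activePriority - 1

WorkerTask>>stop
	keepRunning := false
```

The point of the abstraction is that the start/stop bookkeeping is written once, in one place, instead of being re-invented (and re-broken) in every class that needs a worker loop.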
Gulik.
Andreas Raab wrote:
Hi -
In my never-ending quest for questionable behavior in multi-threaded situations just today I ran into a pattern which is dangerously common in our code. It basically goes like this:
MyClass>>startWorkerProcess
	"worker is an instance variable"
	worker := [self runWorkerProcess] fork.
In this situation the idiom I use is something like this
MyClass>>startWorkerProcess
	"worker is an instance variable"
	worker := [self runWorkerProcess] newProcess.
	worker resume.
:-)
Keith
Being aware of the problem Andreas illustrated, I frequently write;
workerProcess := [self runWorkerProcess] newProcess.
workerProcess resume.
I think changing the semantics of fork is a way to introduce a bug to existing code, i.e. one that is aware of the way fork works and relies on it to work that way.
However, since its operation is not too obvious, you could always write another method, maybe #forkDeferred, to accomplish what Andreas is proposing.
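A sketch of what such a #forkDeferred might look like (the selector is Terry's hypothetical suggestion, not an existing method; the body simply transplants Andreas's helper-process trick under a new name so that plain #fork keeps its current semantics):

```smalltalk
BlockContext>>forkDeferred
	"Create and schedule a Process running the code in the receiver at
	 the active priority, but guarantee that the sender resumes and
	 completes any pending assignment before the new process runs."
	| process helper |
	process := self newProcess.
	process priority: Processor activePriority.
	"A lower-priority helper does the resume, so it cannot preempt us."
	helper := [process resume] newProcess.
	helper priority: Processor activePriority - 1.
	helper resume.
	^process
```

Existing senders of #fork would then be unaffected, and code that wants the deterministic behavior opts in explicitly.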
Terry
=========================================================== Terry Raymond Crafted Smalltalk 80 Lazywood Ln. Tiverton, RI 02878 (401) 624-4517 traymond@craftedsmalltalk.com http://www.craftedsmalltalk.com ===========================================================
-----Original Message-----
From: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] On Behalf Of Andreas Raab
Sent: Monday, February 04, 2008 4:04 PM
To: The general-purpose Squeak developers list
Subject: #fork and deterministic resumption of the resulting process
Terry Raymond wrote:
I think changing the semantics of fork is a way to introduce a bug to existing code, i.e. one that is aware of the way fork works and relies on it to work that way.
I don't think you (as well as some of the other posters) really understand what these changes do. To illustrate:
p1 := p2 := p3 := nil.
p1 := [p1Valid := p1 notNil] forkAt: Processor activePriority - 1.
p2 := [p2Valid := p2 notNil] forkAt: Processor activePriority.
p3 := [p3Valid := p3 notNil] forkAt: Processor activePriority + 1.
Both the first and the last cases are currently deterministic and will stay that way: p1 will be consistently non-nil when this code is run; p3 will be consistently nil when run. This is currently the case and remains the same after applying my changes.
In the second case however, p2 may or may not be nil, depending on whether external interrupts occur. Since that is rare, in 99.99+% of all cases p2 will be non-nil.
What I am proposing is simply to make p2 non-nil in 100% of the cases. There is no change to those parts of the existing semantics that are actually well-defined. The only change is that it takes a rare non-deterministic occurrence and makes the overall behavior consistent in this case.
Cheers, - Andreas
On Feb 5, 2008 12:51 PM, Andreas Raab andreas.raab@gmx.de wrote:
What I am proposing is simply to make p2 non-nil in 100% of the cases. There is no change to those parts of the existing semantics that are actually well-defined. The only change is that it takes a rare non-deterministic occurrence and makes the overall behavior consistent in this case.
You're relying on the current implementation of the scheduler. If the implementation of the scheduler changes (such as would happen when Squeak is made capable of using multiple cores or multiple CPUs) then your bug will re-appear and your "fix" will no longer fix the problem 100% of the time.
Process>>fork should fork the process at the same priority as the calling process. Process>>forkAt: should fork the process at the given priority. If they use other priorities than this, then I consider it a bug.
Gulik.
Michael van der Gulik wrote:
What I am proposing is simply to make p2 non-nil in 100% of the cases. There is no change to those parts of the existing semantics that are actually well-defined. The only change is that it takes a rare non-deterministic occurrence and makes the overall behavior consistent in this case.
You're relying on the current implementation of the scheduler. If the implementation of the scheduler changes (such as would happen when Squeak is made capable of using multiple cores or multiple CPUs) then your bug will re-appear and your "fix" will no longer fix the problem 100% of the time.
Indeed, I'm trying to fix it in the context of the *current* implementation of the scheduler. If you change the scheduler there will be a variety of new issues of which this is by far the smallest.
Cheers, - Andreas
On Feb 5, 2008, at 12:51 AM, Andreas Raab wrote:
p1 := p2 := p3 := nil.
p1 := [p1Valid := p1 notNil] forkAt: Processor activePriority - 1.
p2 := [p2Valid := p2 notNil] forkAt: Processor activePriority.
p3 := [p3Valid := p3 notNil] forkAt: Processor activePriority + 1.
Both, the first and the last case are currently deterministic and will stay that way: p1 will be consistently non-nil when this code is run; p3 will be consistently nil when run. This is currently the case and remains the same after applying my changes.
In the second case however, p2 may or may not be nil, depending on whether external interrupts occur. Since that is rare, in 99.99+% of all cases p2 will be non-nil.
What I am proposing is simply to make p2 non-nil in 100% of the cases. There is no change to those parts of the existing semantics that are actually well-defined. The only change is that it takes a rare non-deterministic occurrence and makes the overall behavior consistent in this case.
If something could happen, it will happen. Reading the code above, I now expect it not to be deterministic for p2. I admit the first time I expected p2 to get a value. So now I prefer to have a critical section regarding p2.
I would even write the critical section for p1 and p3, since in most schedulers the code is non-deterministic. Especially for p3: the scheduler may not yield the current thread for performance reasons. (But not in Squeak.)
So I vote for the critical section.
Cheers,
- Andreas
Mth
On Feb 4, 2008, at 5:19 PM, Mathieu Suen wrote:
If something could happen, it will happen. Reading the code above, I now expect it not to be deterministic for p2. I admit the first time I expected p2 to get a value. So now I prefer to have a critical section regarding p2.
I would even write the critical section for p1 and p3, since in most schedulers the code is non-deterministic. Especially for p3: the scheduler may not yield the current thread for performance reasons. (But not in Squeak.)
So I vote for the critical section.
I'm not sure what you mean by critical section. I'm assuming that you mean to use explicit synchronization. For example (using TMutex from Croquet, sorry):
mutex := TMutex new.
p2 := nil.
mutex critical: [
	p2 := [mutex critical: [p2Valid := p2 notNil]]
		forkAt: Processor activePriority]
In other words, the programmer should be smart enough to do the appropriate synchronization. Is this what you mean? That doesn't seem unreasonable to me (and I say that as the guy who wrote the broken code that started Andreas on this topic :-) ). I'm a little uncomfortable with the notion of not giving processes the priority explicitly requested by the programmer.
Josh
On Feb 5, 2008, at 2:47 AM, Joshua Gargus wrote:
I'm not sure what you mean by critical section. I'm assuming that you mean to use explicit synchronization. For example (using TMutex from Croquet, sorry):
mutex := TMutex new.
p2 := nil.
mutex critical: [
	p2 := [mutex critical: [p2Valid := p2 notNil]]
		forkAt: Processor activePriority]
Yes.
In other words, the programmer should be smart enough to do the appropriate synchronization. Is this what you mean? That doesn't seem unreasonable to me (and I say that as the guy who wrote the broken code that started Andreas on this topic :-) ). I'm a little uncomfortable with the notion of not giving processes the priority explicitly requested by the programmer.
Btw, priority should only give a flavor of how often a process runs; it is not a way to make sure that one process will run before another. IMHO.
Mth
On Feb 4, 2008, at 5:47 PM, Joshua Gargus wrote:
I'm a little uncomfortable with the notion of not giving processes the priority explicitly requested by the programmer.
I obviously didn't read Andreas's proposal closely enough. With more than a glance, it is clear that a lower-priority helper process is used only for a moment to start up the real process at the priority requested by the user.
The only problem that I can see with the proposal is that under "heavy load", there may always be a runnable process of higher priority, so that the helper process is starved. Would this be a problem for anyone in practice? You could always work around this with:
desiredPriority := Processor activePriority.
[worker := [self runWorkerProcess] forkAt: desiredPriority]
	forkAt: desiredPriority + 1
Josh
On 5-Feb-08, at 9:10 AM, Joshua Gargus wrote:
On Feb 4, 2008, at 5:47 PM, Joshua Gargus wrote:
I'm a little uncomfortable with the notion of not giving processes the priority explicitly requested by the programmer.
I obviously didn't read Andreas's proposal closely enough. With more than a glance, it is clear that a lower-priority helper process is used only for a moment to start up the real process at the priority requested by the user.
Dang! You're right! Reminder to self - avoid commenting after a quick look at densely written code.
I agree that there is surely still a moderate-likelihood problem here, in that the helper process, being at a lower priority, is not guaranteed to run anytime soon. If there are several processes at the current priority then they *all* have to get suspended before the helper can run and complete the fork. Your suggestion might solve that. I think.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim To iterate is human; to recurse, divine.
On 04/02/2008, Andreas Raab andreas.raab@gmx.de wrote:
Hi -
In my never-ending quest for questionable behavior in multi-threaded situations just today I ran into a pattern which is dangerously common in our code. It basically goes like this:
Hmm, IMO, you are trying to kill two rabbits with one shot..
Why not write it like the following:
MyClass>>startWorkerProcess
	"worker is an instance variable"
	running := true.
	worker := [self runWorkerProcess] fork.

MyClass>>runWorkerProcess
	"Run the worker process"
	[running] whileTrue: ["...do the work..."].

MyClass>>stopWorkerProcess
	"Stop the worker process"
	running := false. "let it terminate itself"
Yes, you will need an additional inst var - 'running' - but I think it's reasonable: controlling a process in the context of scheduler operations, where you need its handle, and controlling when it should terminate gracefully (by setting the running flag to false), are different things.
On Feb 4, 2008, at 6:21 PM, Igor Stasenko wrote:
Hmm, IMO, you are trying to kill two rabbits with one shot..
Why not write it like the following:
MyClass>>startWorkerProcess
	"worker is an instance variable"
	running := true.
	worker := [self runWorkerProcess] fork.

MyClass>>runWorkerProcess
	"Run the worker process"
	[running] whileTrue: ["...do the work..."].

MyClass>>stopWorkerProcess
	"Stop the worker process"
	running := false. "let it terminate itself"
This doesn't work either (and is actually the reason that the original, broken pattern described by Andreas was used). Consider the following:
inst := MyClass new.
inst startWorkerProcess; stopWorkerProcess; startWorkerProcess
If the first worker process happens not to notice that 'running' was set to false for a moment, then you will have two processes running. By comparing against the worker process, you avoid this problem.
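To spell the point out, a sketch of the interleaving (processA and processB are hypothetical names for the two forked processes):

```smalltalk
inst startWorkerProcess.	"worker := processA"
inst stopWorkerProcess.		"worker := nil"
inst startWorkerProcess.	"worker := processB"
"Even if processA missed the brief worker = nil window, its next
 loop test is (Processor activeProcess == worker), i.e.
 processA == processB, which is false - so processA still
 terminates. A shared 'running' flag gives no such guarantee:
 processA could miss the false-then-true transition entirely and
 keep looping alongside processB."
```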
Josh
On 05/02/2008, Joshua Gargus schwa@fastmail.us wrote:
On Feb 4, 2008, at 6:21 PM, Igor Stasenko wrote:
MyClass>>startWorkerProcess
	"worker is an instance variable"
	running := true.
	worker := [self runWorkerProcess] fork.

MyClass>>runWorkerProcess
	"Run the worker process"
	[running] whileTrue: ["...do the work..."].

MyClass>>stopWorkerProcess
	"Stop the worker process"
	running := false. "let it terminate itself"
This doesn't work either (and is actually the reason that the original, broken pattern described by Andreas was used). Consider the following:
inst := MyClass new.
inst startWorkerProcess; stopWorkerProcess; startWorkerProcess
If the first worker process happens not to notice that 'running' was set to false for a moment, then you will have two processes running. By comparing against the worker process, you avoid this problem.
It really depends on the implementation and your needs. If you want to make sure that after issuing stopWorkerProcess the worker process is no longer running, you can modify stopWorkerProcess to look like:
MyClass>>stopWorkerProcess
	"Stop the worker process"
	running := false. "let it terminate itself"
	self waitForProcessToFinish
Or, you can place the wait at the beginning of the startWorkerProcess method. There is a vague set of options for how you can control the execution. But it's false to assume that you can avoid such details when dealing with it. You _should_ know how things roll and how they fit your tasks, otherwise you will break everything.
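One way to flesh out the hypothetical #waitForProcessToFinish (a sketch; the 'done' Semaphore instance variable and the method names are invented here, not existing protocol):

```smalltalk
MyClass>>startWorkerProcess
	"running and done are instance variables; done is a fresh
	 Semaphore that the worker signals as its very last act."
	running := true.
	done := Semaphore new.
	worker := [self runWorkerProcess] fork.

MyClass>>runWorkerProcess
	"Run the worker process, announcing termination when done."
	[running] whileTrue: ["...do the work..."].
	done signal

MyClass>>waitForProcessToFinish
	"Block the caller until the worker has actually exited its loop."
	done wait
```

With this, stopWorkerProcess followed by startWorkerProcess can never leave two workers looping, because the second start does not proceed until the first loop has provably exited.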
On Feb 5, 2008, at 8:49 AM, Igor Stasenko wrote:
On 05/02/2008, Joshua Gargus schwa@fastmail.us wrote:
On Feb 4, 2008, at 6:21 PM, Igor Stasenko wrote:
On 04/02/2008, Andreas Raab andreas.raab@gmx.de wrote:
Hi -
In my never-ending quest for questionable behavior in multi- threaded situations just today I ran into a pattern which is dangerously common in our code. It basically goes like this:
Hmm, IMO, you wanting to kill two rabbits in one shot..
Why not write like following:
MyClass>>startWorkerProcess "worker is an instance variable" running := true. worker := [self runWorkerProcess] fork.
MyClass>>runWorkerProcess "Run the worker process" [running] whileTrue:[ "...do the work..." ].
MyClass>>stopWorkerProcess "Stop the worker process" running := false. "let it terminate itself"
This doesn't work either (and is actually the reason that the original, broken pattern described by Andreas was used). Consider the following:
inst := MyClass new. inst startWorkerProcess; stopWorkerProcess; startWorkerProcess
If the first worker process happens not to notice that 'running' was set to false for a moment, then you will have two processes running. By comparing against the worker process, you avoid this problem.
It really depends on the implementation and your needs. If you want to make sure that after issuing stopWorkerProcess the worker process is no longer running, you can modify stopWorkerProcess to look like:
MyClass>>stopWorkerProcess "Stop the worker process" running := false. "let it terminate itself" self waitForProcessToFinish
Or, you can place the wait at the beginning of the startWorkerProcess method. There is a wide range of options for controlling the execution. But it's false to assume that you can avoid such details when dealing with concurrency. You _should_ know how things run and how they fit your tasks, otherwise you will break everything.
-- Best regards, Igor Stasenko AKA sig.
Oops, sorry about the last message, I hit "reply" too soon by accident,
On Feb 5, 2008, at 8:49 AM, Igor Stasenko wrote:
On 05/02/2008, Joshua Gargus schwa@fastmail.us wrote:
On Feb 4, 2008, at 6:21 PM, Igor Stasenko wrote:
On 04/02/2008, Andreas Raab andreas.raab@gmx.de wrote:
Hi -
In my never-ending quest for questionable behavior in multi- threaded situations just today I ran into a pattern which is dangerously common in our code. It basically goes like this:
Hmm, IMO, you're trying to kill two rabbits with one shot..
Why not write it like the following:
MyClass>>startWorkerProcess "worker is an instance variable" running := true. worker := [self runWorkerProcess] fork.
MyClass>>runWorkerProcess "Run the worker process" [running] whileTrue:[ "...do the work..." ].
MyClass>>stopWorkerProcess "Stop the worker process" running := false. "let it terminate itself"
This doesn't work either (and is actually the reason that the original, broken pattern described by Andreas was used). Consider the following:
inst := MyClass new. inst startWorkerProcess; stopWorkerProcess; startWorkerProcess
If the first worker process happens not to notice that 'running' was set to false for a moment, then you will have two processes running. By comparing against the worker process, you avoid this problem.
It really depends on the implementation and your needs.
Of course.
If you want to make sure that after issuing stopWorkerProcess the worker process is no longer running, you can modify stopWorkerProcess to look like:
MyClass>>stopWorkerProcess "Stop the worker process" running := false. "let it terminate itself" self waitForProcessToFinish
(implementation-specific comment, not really on-topic for the thread) This is undesirable because you block the "controller" process while it could be doing other useful work.
Or, you can place the wait at the beginning of the startWorkerProcess method. There is a wide range of options for controlling the execution. But it's false to assume that you can avoid such details when dealing with concurrency.
I don't think that anybody is making that assumption.
You _should_ know how things run and how they fit your tasks, otherwise you will break everything.
Sounds perfect, except for these little things called "bugs". :-)
Josh
-- Best regards, Igor Stasenko AKA sig.
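Joshua's failure mode (a stop followed quickly by a start leaving two workers running) can be modeled deterministically outside Squeak. A hedged sketch in Python, not Squeak code: generators stand in for cooperatively scheduled processes that poll the flag once per step, and all names are illustrative:

```python
class FlagWorker:
    """Model of the running-flag idiom; each worker polls the flag lazily."""

    def __init__(self):
        self.running = False
        self.workers = []            # one generator per "process"

    def start(self):
        self.running = True
        g = self._loop()
        self.workers.append(g)
        next(g)                      # the new worker takes one step, then suspends

    def _loop(self):
        while self.running:          # the flag is checked once per iteration
            yield "working"

    def stop(self):
        self.running = False         # "let it terminate itself"... eventually

m = FlagWorker()
m.start()
m.stop()                             # old worker never gets to re-check the flag
m.start()                            # flag is True again before it does

# Step every worker once more: both are still inside their loops.
alive = sum(1 for g in m.workers if next(g, None) is not None)
print(alive)                         # 2 -- two worker "processes" at once
```

Comparing against the worker's identity, as in Andreas's original pattern, is exactly what rules this out.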
On 05/02/2008, Joshua Gargus schwa@fastmail.us wrote: [skip]
If you want to make sure that after issuing stopWorkerProcess the worker process is no longer running, you can modify stopWorkerProcess to look like:
MyClass>>stopWorkerProcess "Stop the worker process" running := false. "let it terminate itself" self waitForProcessToFinish
(implementation-specific comment, not really on-topic for the thread) This is undesirable because you block the "controller" process while it could be doing other useful work.
Again, this may or may not suit your needs. If your worker process operates on other state which must be accessed exclusively, the best way is to wait until it finishes before running another process, rather than enclosing every bit of code in semaphores and critical sections.
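Igor's waiting variant can be sketched with Python threads (an analogue only: waitForProcessToFinish is assumed to be a join on the worker, and the method names simply mirror the Smalltalk example):

```python
import threading
import time

class MyClass:
    def __init__(self):
        self.running = False
        self.worker = None

    def start_worker_process(self):
        self.running = True
        self.worker = threading.Thread(target=self._run)
        self.worker.start()

    def _run(self):
        while self.running:
            time.sleep(0.001)        # ...do the work...

    def stop_worker_process(self):
        self.running = False
        self.worker.join()           # wait for the process to finish

m = MyClass()
m.start_worker_process()
m.stop_worker_process()              # blocks until the worker has exited
print(m.worker.is_alive())           # False
```

The trade-off is the one raised elsewhere in the thread: the controller blocks inside stop_worker_process until the worker exits.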
This will make sure that #fork has (for the purpose of resumption) the same semantics as forking at a lower priority has.
What do people think about this?
Curiously, I prefer to have the *opposite* behavior in #fork, i.e. always resume the forked process if the priority is equal. Note that this wouldn't be an ultimate fix, because it could lead to an equally wrong idiom,
MyClass>>startWorkerProcess "worker is an instance variable" [self runWorkerProcess] fork.
MyClass>>runWorkerProcess "Run the worker process" worker := Processor activeProcess. [Processor activeProcess == worker] whileTrue:[ "...do the work..." ].
MyClass>>stopWorkerProcess "Stop the worker process" worker := nil. "let it terminate itself"
I'm with Terry on the correct idiom to use, i.e.
workerProcess := [self runWorkerProcess] newProcess. workerProcess resume.
if you really do not want to use a separate instance variable. Another possibility is to signal a Notification:
| running | [running := true. [running] whileTrue: [ ... ]] on: StopRunning do: [ :ex | running := false. ex resume ].
Paolo
Paolo Bonzini wrote:
I'm with Terry on the correct idiom to use, i.e.
workerProcess := [self runWorkerProcess] newProcess. workerProcess resume.
Sigh. One of the problems with examples is that they are ... well examples. They are not the actual code. The above solution is simply not applicable in our context (if it were, I would agree with it as the better solution).
[BTW, I'm gonna drop out of this thread since it's clear that there is too much opposition for such a change to get into Squeak. Which is fine by me - I'll wait until you will get bitten in some really cruel and unusual ways and at that point you might be ready to understand why this fix is valuable. Personally, I think that changes that take out an unusual case of non-determinism like here are always worth it - if behavior is deterministic you can test it and fix it. If it's not you might get lucky a hundred times in a row. And in the one critical situation it will bite you].
Cheers, - Andreas
On Feb 5, 2008, at 10:29 , Andreas Raab wrote:
Paolo Bonzini wrote:
I'm with Terry on the correct idiom to use, i.e. workerProcess := [self runWorkerProcess] newProcess. workerProcess resume.
Sigh. One of the problems with examples is that they are ... well examples. They are not the actual code. The above solution is simply not applicable in our context (if it were, I would agree with it as the better solution).
[BTW, I'm gonna drop out of this thread since it's clear that there is too much opposition for such a change to get into Squeak. Which is fine by me - I'll wait until you will get bitten in some really cruel and unusual ways and at that point you might be ready to understand why this fix is valuable. Personally, I think that changes that take out an unusual case of non-determinism like here are always worth it - if behavior is deterministic you can test it and fix it. If it's not you might get lucky a hundred times in a row. And in the one critical situation it will bite you].
Well, you should give us a bit more than a few hours ;) Until now most posters did not even understand the proposal.
I for one would appreciate getting your fix in. It does not change the current semantics, and makes one very common idiom (var := [...] fork) safer to use. There may be better idioms, granted. However, for now Squeak's scheduling policy is beautifully deterministic, and I like keeping simple things simple.
- Bert -
On Feb 5, 2008, at 11:02 AM, Bert Freudenberg wrote:
On Feb 5, 2008, at 10:29 , Andreas Raab wrote:
Paolo Bonzini wrote:
I'm with Terry on the correct idiom to use, i.e. workerProcess := [self runWorkerProcess] newProcess. workerProcess resume.
Sigh. One of the problems with examples is that they are ... well examples. They are not the actual code. The above solution is simply not applicable in our context (if it were, I would agree with it as the better solution).
[BTW, I'm gonna drop out of this thread since it's clear that there is too much opposition for such a change to get into Squeak. Which is fine by me - I'll wait until you will get bitten in some really cruel and unusual ways and at that point you might be ready to understand why this fix is valuable. Personally, I think that changes that take out an unusual case of non-determinism like here are always worth it - if behavior is deterministic you can test it and fix it. If it's not you might get lucky a hundred times in a row. And in the one critical situation it will bite you].
Well, you should give us a bit more than a few hours ;) Until now most posters did not even understand the proposal.
Why?
I for one would appreciate getting your fix in. It does not change the current semantics, and makes one very common idiom (var := [...] fork) safer to use. There may be better idioms, granted. However, for now Squeak's scheduling policy is beautifully deterministic, and I like keeping simple things simple.
- Bert -
Mth
Bert Freudenberg wrote:
[BTW, I'm gonna drop out of this thread since it's clear that there is too much opposition for such a change to get into Squeak. Which is fine by me - I'll wait until you will get bitten in some really cruel and unusual ways and at that point you might be ready to understand why this fix is valuable. Personally, I think that changes that take out an unusual case of non-determinism like here are always worth it - if behavior is deterministic you can test it and fix it. If it's not you might get lucky a hundred times in a row. And in the one critical situation it will bite you].
Well, you should give us a bit more than a few hours ;) Until now most posters did not even understand the proposal.
That's part of the reason why I won't pursue these changes here. To me these changes are just as important as the ones that I posted for Delay and Semaphore. However, unless one understands the kinds of problems that are caused by the current code it is pointless to argue that fixing them is important - I'm sure that unless people had been bitten by Delay and Semaphore we would have the same kinds of debates with all sorts of well-meant advice on how you "ought" to write your code ;-)
[The obvious problem with this advice is that these fixes are not necessarily only to fix *my* code but that of *other* people. I only got started down this path after I saw similar patterns with three different sets of author initials on them. In other words, the problem is far more than any individual's shortcoming, and fixing it in general means that it will be fixed for any new people working on our projects]
Cheers, - Andreas
That's part of the reason why I won't pursue these changes here. To me these changes are just as important as the ones that I posted for Delay and Semaphore. However, unless one understands the kinds of problems that are caused by the current code it is pointless to argue that fixing them is important - I'm sure that unless people had been bitten by Delay and Semaphore we would have the same kinds of debates with all sorts of well-meant advice on how you "ought" to write your code ;-)
It's not that I don't think it's important. I think the *bugs* are important to fix, but that the root cause just *cannot* be fixed. It's just that:
1) the many people who made the same mistake maybe were just cutting'n'pasting buggy code;
2) especially, the fix is not 100% safe unless I'm mistaken.
Paolo
Paolo Bonzini wrote:
That's part of the reason why I won't pursue these changes here. To me these changes are just as important as the ones that I posted for Delay and Semaphore. However, unless one understands the kinds of problems that are caused by the current code it is pointless to argue that fixing them is important - I'm sure that unless people had been bitten by Delay and Semaphore we would have the same kinds of debates with all sorts of well-meant advice on how you "ought" to write your code ;-)
It's not that I don't think it's important. I think the *bugs* are important to fix, but that the root cause just *cannot* be fixed.
This completely depends on your definition of "root cause" and "cannot". For me, it's the fact that fork will behave in 99.99% of the cases in one way and in 0.01% in a different way. That kind of non-determinism is probably the root cause for many lingering bugs in our system and it *can* be eliminated.
It's just that:
- the many people who made the same mistake maybe were just cutting'n'pasting buggy code;
That is of course a possibility but unless you think the majority of people recognized the bug in the code snippet I posted, I fail to see how this makes a difference.
- especially, the fix is not 100% safe unless I'm mistaken.
What do you mean by "100% safe"? It is 100% deterministic (which is what I care about); I'm not sure what you mean when you use the term "safe" here.
Cheers, - Andreas
- especially, the fix is not 100% safe unless I'm mistaken.
What do you mean by "100% safe"? It is 100% deterministic (which is what I care about); I'm not sure what you mean when you use the term "safe" here.
It is not. Whether the low-priority process actually starts depends on external factors. If you have two priority-40 processes, they might prevent the priority-39 process from starting and resuming the forked process. Of course, unless I'm mistaken.
Paolo
Paolo Bonzini wrote:
- especially, the fix is not 100% safe unless I'm mistaken.
What do you mean by "100% safe"? It is 100% deterministic (which is what I care about); I'm not sure what you mean when you use the term "safe" here.
It is not.
Err, it is not what? Deterministic? Or safe? The point about it being deterministic did not relate to when exactly the process would resume (no real-time guarantee) but rather that it would resume deterministically in relation to its parent process (in this case, only after the parent process got suspended).
Whether the low-priority process actually starts depends on external factors. If you have two priority-40 processes, they might prevent the priority-39 process from starting and resuming the forked process.
Correct. And it is an interesting question to discuss how the system *should* behave if it's exceeding its capabilities (i.e., running at 100% CPU). But I'll leave that discussion for a different day.
Cheers, - Andreas
Andreas Raab wrote:
Paolo Bonzini wrote:
- especially, the fix is not 100% safe unless I'm mistaken.
What do you mean by "100% safe"? It is 100% deterministic (which is what I care about); I'm not sure what you mean when you use the term "safe" here.
It is not.
Err, it is not what? Deterministic? Or safe? The point about it being deterministic did not relate to when exactly the process would resume (no real-time guarantee) but rather that it would resume deterministically in relation to its parent process (in this case, only after the parent process got suspended).
I don't have the image in front of me, but what about when the process is running at the bottom priority?
What I'm guessing is happening in your case, is that some higher priority process is interrupting the "parent" process after it has made the forked process runnable, but before it has assigned the variable. When the "parent" process is interrupted, it is inserted at the end of runnable processes at that priority. Therefore, when the higher priority process yields control, instead of yielding control back to the "parent" process, it yields control to the forked process.
If my guess is correct and you want deterministic behavior, you could change the process scheduler code so that when a higher priority process interrupts a lower priority process, the lower priority process is inserted at the beginning of the runnable processes at that priority instead of the end.
Of course, this change would break the simple time sliced scheduler hack: [[(Delay forMilliseconds: timeslice) wait] repeat] forkAt: Processor highestPriority Also, some processes that worked before, might starve now.
John Brant
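John's guess about run-queue placement can be illustrated with a toy model (a sketch only, not the actual Squeak scheduler: a single priority level's run queue modeled as a deque, with made-up labels):

```python
from collections import deque

def runs_next_after_interrupt(preempted_to_front):
    """Who runs at this priority once the higher-priority interrupt returns?"""
    runnable = deque()
    runnable.append("forked child")       # parent just forked it at equal priority
    # A higher-priority process now preempts the parent, before the parent
    # has assigned the worker variable. Where does the parent re-enter?
    if preempted_to_front:
        runnable.appendleft("parent")     # John's proposed change: front of queue
    else:
        runnable.append("parent")         # modeled current behaviour: back of queue
    return runnable.popleft()

print(runs_next_after_interrupt(False))   # forked child -- may see worker == nil
print(runs_next_after_interrupt(True))    # parent -- the assignment completes first
```

As John notes, front-insertion would also break the timesliced-scheduler hack and could starve processes that previously made progress.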
Err, it is not what? Deterministic? Or safe? The point about it being deterministic did not relate to when exactly the process would resume (no real-time guarantee) but rather that it would resume deterministically in relation to its parent process (in this case, only after the parent process got suspended).
Not completely deterministic. The behavior may change depending on whether there are other processes or not, that run at the active process priority.
What if you instead bumped a little the priority of the active process until the forked process started running?
Paolo
Paolo Bonzini wrote:
Err, it is not what? Deterministic? Or safe? The point about it being deterministic did not relate to when exactly the process would resume (no real-time guarantee) but rather that it would resume deterministically in relation to its parent process (in this case, only after the parent process got suspended).
Not completely deterministic. The behavior may change depending on whether there are other processes or not, that run at the active process priority.
It is entirely deterministic in such that the forked process will not be resumed until the parent process is suspended. No amount of other processes change that; even if they take 100% CPU it won't change the fact that the process will not be resumed before its parent is suspended.
What if you instead bumped a little the priority of the active process until the forked process started running?
And how does the parent process know that the forked process is running?
Cheers, - Andreas
Not completely deterministic. The behavior may change depending on whether there are other processes or not, that run at the active process priority.
It is entirely deterministic in such that the forked process will not be resumed until the parent process is suspended. No amount of other processes change that; even if they take 100% CPU it won't change the fact that the process will not be resumed before its parent is suspended.
I'm talking about starvation, due to the forked process not entering Squeak's round-robin scheduling at the given priority. It cannot happen without your patch, and can with.
What if you instead bumped a little the priority of the active process until the forked process started running?
And how does the parent process know that the forked process is running?
The forked process could reset the priority of the parent process:
fork ^self forkAt: (Processor activePriority bitAnd: -2)
forkAt: priority Processor activePriority = priority ifTrue: [ p := Processor activeProcess. p priority: (priority bitOr: 1). forkedProcess := [p priority = (priority bitOr: 1) ifTrue: [p priority: priority]. self value] newProcess. ] ifFalse: [ forkedProcess := self newProcess ]. forkedProcess priority: priority. forkedProcess resume. ^forkedProcess
Maybe this causes a different kind of race condition though.
Paolo
On Feb 5, 2008, at 7:51 PM, Andreas Raab wrote:
Paolo Bonzini wrote:
That's part of the reason why I won't pursue these changes here. To me these changes are just as important as the ones that I posted for Delay and Semaphore. However, unless one understands the kinds of problems that are caused by the current code it is pointless to argue that fixing them is important - I'm sure that unless people had been bitten by Delay and Semaphore we would have the same kinds of debates with all sorts of well-meant advice on how you "ought" to write your code ;-)
It's not that I don't think it's important. I think the *bugs* are important to fix, but that the root cause just *cannot* be fixed.
This completely depends on your definition of "root cause" and "cannot". For me, it's the fact that fork will behave in 99.99% of the cases in one way and in 0.01% in a different way. That kind of non-determinism is probably the root cause for many lingering bugs in our system and it *can* be eliminated.
It's just that:
- the many people who made the same mistake maybe were just cutting'n'pasting buggy code;
That is of course a possibility but unless you think the majority of people recognized the bug in the code snippet I posted, I fail to see how this makes a difference.
- especially, the fix is not 100% safe unless I'm mistaken.
What do you mean by "100% safe"? It is 100% deterministic (which is what I care about); I'm not sure what you mean when you use the term "safe" here.
It may be a stupid remark but I'll try :) Even with your fix, it could be unsafe if the event handler has a higher priority. Am I right?
Cheers,
- Andreas
Mth
On Feb 5, 2008 11:02 PM, Bert Freudenberg bert@freudenbergs.de wrote:
On Feb 5, 2008, at 10:29 , Andreas Raab wrote:
Paolo Bonzini wrote:
I'm with Terry on the correct idiom to use, i.e. workerProcess := [self runWorkerProcess] newProcess. workerProcess resume.
Sigh. One of the problems with examples is that they are ... well examples. They are not the actual code. The above solution is simply not applicable in our context (if it were, I would agree with it as the better solution).
[BTW, I'm gonna drop out of this thread since it's clear that there is too much opposition for such a change to get into Squeak. Which is fine by me - I'll wait until you will get bitten in some really cruel and unusual ways and at that point you might be ready to understand why this fix is valuable. Personally, I think that changes that take out an unusual case of non-determinism like here are always worth it - if behavior is deterministic you can test it and fix it. If it's not you might get lucky a hundred times in a row. And in the one critical situation it will bite you].
Well, you should give us a bit more than a few hours ;) Until now most posters did not even understand the proposal.
I for one would appreciate getting your fix in. It does not change the current semantics, and makes one very common idiom (var := [...] fork) safer to use. There may be better idioms, granted. However, for now Squeak's scheduling policy is beautifully deterministic, and I like keeping simple things simple.
Andreas's original code was buggy and his proposed fix was incorrect.
You should never make any assumptions about the scheduling behaviour of your environment. It can and will change in the future. Also, it makes your code less portable across dialects.
Gulik.
Michael van der Gulik wrote:
Andreas's original code was buggy and his proposed fix was incorrect.
If I needed any more proof that people don't get what I'm trying to fix, this would be it ;-) Seriously, you *really* don't get what I'm trying to address with the fix.
Anyway, I don't care. Croquet has that fix and it's your choice to ignore it and deal with the consequences. And with that I'm out of this thread for real and apologize for the pointless waste of bandwidth.
Cheers, - Andreas
On 2/6/08, Andreas Raab andreas.raab@gmx.de wrote:
Michael van der Gulik wrote:
Andreas's original code was buggy and his proposed fix was incorrect.
If I needed any more proof that people don't get what I'm trying to fix, this would be it ;-) Seriously, you *really* don't get what I'm trying to address with the fix.
Anyway, I don't care. Croquet has that fix and it's your choice to ignore it and deal with the consequences. And with that I'm out of this thread for real and apologize for the pointless waste of bandwidth.
Sorry; I think my original email was a bit blunt and direct. No personal attack was intended and I do have a lot of respect for you.
Gulik.
Andreas Raab a écrit :
Anyway, I don't care. Croquet has that fix and it's your choice to ignore it and deal with the consequences. And with that I'm out of this thread for real and apologize for the pointless waste of bandwidth.
Yeah, after discussing licence issues, underscores, the future of this or whatever for so long, we sure cannot waste any more time on pragmatic things.
Your change has my vote, would you be kind to open a Mantis issue?
Any interest in completing the API with the opposite alternative, by way of a helperProcess at priority+1? I don't know how to name it; #forkFront ?
Nicolas
Cheers,
- Andreas
hi andreas
I know this feeling far too well, often because of your comments on what I'm doing :) Getting feedback on really difficult points is important, so do not leave the discussion.
I'm bad at concurrent programming, so normally I read such threads to try to understand and fill the gaps in my knowledge :) Stef
On Feb 6, 2008, at 12:47 AM, Andreas Raab wrote:
Michael van der Gulik wrote:
Andreas's original code was buggy and his proposed fix was incorrect.
If I needed any more proof that people don't get what I'm trying to fix, this would be it ;-) Seriously, you *really* don't get what I'm trying to address with the fix.
Anyway, I don't care. Croquet has that fix and it's your choice to ignore it and deal with the consequences. And with that I'm out of this thread for real and apologize for the pointless waste of bandwidth.
Cheers,
- Andreas
On Feb 6, 2008, at 0:08 , Michael van der Gulik wrote:
You should never make any assumptions about the scheduling behaviour of your environment.
Nonsense. Squeak *defines* the scheduling behavior. That's one, maybe even *the* major benefit over system threads.
- Bert -
On 2/7/08, Bert Freudenberg bert@freudenbergs.de wrote:
On Feb 6, 2008, at 0:08 , Michael van der Gulik wrote:
You should never make any assumptions about the scheduling behaviour of your environment.
Nonsense. Squeak *defines* the scheduling behavior. That's one, maybe even *the* major benefit over system threads.
Er... I can't tell if you're joking or if you're serious.
I always write my code under the assumption that some day, in the future, some bright spark will write a better VM that will run separate Smalltalk Processes on separate OS processes/threads by whatever means, thus better utilising multiple cores or CPUs.
When that day comes, my code will run faster and your code will break.
Gulik.
Michael van der Gulik a écrit :
Er... I can't tell if you're joking or if you're serious.
I always write my code under the assumption that some day, in the future, some bright spark will write a better VM that will run separate Smalltalk Processes on separate OS processes/threads by whatever means, thus better utilising multiple cores or CPUs.
When that day comes, my code will run faster and your code will break.
I see, sustainable development. You're writing code for our children.
Gulik.
On 06/02/2008, nicolas cellier ncellier@ifrance.com wrote:
Michael van der Gulik a écrit :
Er... I can't tell if you're joking or if you're serious.
I always write my code under the assumption that some day, in the future, some bright spark will write a better VM that will run separate Smalltalk Processes on separate OS processes/threads by whatever means, thus better utilising multiple cores or CPUs.
When that day comes, my code will run faster and your code will break.
I see, sustainable development. You're writing code for our children.
Oh, come on, how can you be so blind? You need a foundation on which you can build your application. You need an API with clear rules of play. And these rules are defined by shared principles and common terms, easily understandable by people. A newcomer should not have to spend time investigating how #fork works on a particular Squeak version.
So, when I say #fork, it should fork today, fork tomorrow, and fork in 20 years. Because it's a generic rule, and the API should behave as the generic rule dictates: do whatever it takes to make two processes run in parallel. But if you start doing the reverse and let the API influence the rules, then what do you get as a result? 10 APIs, each defining its own semantics of fork?
On 2/7/08, Igor Stasenko siguctua@gmail.com wrote:
On 06/02/2008, nicolas cellier ncellier@ifrance.com wrote:
Michael van der Gulik a écrit :
Er... I can't tell if you're joking or if you're serious.
I always write my code under the assumption that some day, in the future, some bright spark will write a better VM that will run separate Smalltalk Processes on separate OS processes/threads by whatever means, thus better utilising multiple cores or CPUs.
When that day comes, my code will run faster and your code will break.
I see, sustainable development. You're writing code for our children.
Oh, come on, how can you be so blind? You need a foundation on which you can build your application. You need an API with clear rules of play. And these rules are defined by shared principles and common terms, easily understandable by people. A newcomer should not have to spend time investigating how #fork works on a particular Squeak version.
So, when I say #fork, it should fork today, fork tomorrow, and fork in 20 years. Because it's a generic rule, and the API should behave as the generic rule dictates: do whatever it takes to make two processes run in parallel. But if you start doing the reverse and let the API influence the rules, then what do you get as a result? 10 APIs, each defining its own semantics of fork?
Sig: I think that people are now just saying rubbish to provoke a reaction. I wouldn't bother replying.
Gulik.
On Wed, 06 Feb 2008 14:38:43 -0800, Michael van der Gulik mikevdg@gmail.com wrote:
On 2/7/08, Igor Stasenko siguctua@gmail.com wrote:
On 06/02/2008, nicolas cellier ncellier@ifrance.com wrote:
Michael van der Gulik a écrit :
When that day comes, my code will run faster and your code will
break.
I see, sustainable development. You're writing code for our children.
Oh, come on, how can you be so blind?
Sig: I think that people are now just saying rubbish to provoke a reaction. I wouldn't bother replying.
As much as I am a fan of a solid code base and reusable code, actual history shows very little survives that long, and that which does survive is often a problem.
Programming for a not-yet-extant paradigm, environment, etc., may not be the wisest use of resources.
Not trollin', just sayin'.
===Blake===
Michael,
I think that people are now just saying rubbish to provoke a reaction. I wouldn't bother replying.
Wow, does this "people" include yourself? What was this email about?
http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-February/125110....
You apologized for making a provoking comment there. Why again?
And yes, as Andreas wrote, I don't think you understand the problem Andreas was trying to solve. And yes, it was started under the assumption that the current VM behaves the way it does. And yes, Bert has a point: we have control over the behavior, and determinism under that assumption has real practical merit. Please re-visit the problem domain this discussion was in.
-- Yoshiki
BTW, you apologized for making a direct comment but didn't mention that claiming Andreas' code was buggy and incorrect was wrong. Is that still so?
On 2/7/08, Yoshiki Ohshima yoshiki@vpri.org wrote:
Michael,
I think that people are now just saying rubbish to provoke a reaction. I wouldn't bother replying.
Wow, does this "people" include yourself? What was this email about?
http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-February/125110....
You apologized for making a provoking comment there. Why again?
It was not intended to be a provoking comment.
And yes, as Andreas wrote, I don't think you understand the problem
I understand the problem and his fix.
BTW, you apologized for making a direct comment but didn't mention that claiming Andreas' code was buggy and incorrect was wrong. Is that still so?
Andreas is a great programmer, but the example he posted had a bug in it and the proposed fix was incorrect.
I'll continue this discussion after I've released a stable, multi-threaded VM in which Andreas's example breaks.
Gulik.
On 2/7/08, Michael van der Gulik mikevdg@gmail.com wrote:
On 2/7/08, Yoshiki Ohshima yoshiki@vpri.org wrote:
Michael,
I think that people are now just saying rubbish to provoke a reaction. I wouldn't bother replying.
Wow, does this "people" include yourself? What was this email about?
http://lists.squeakfoundation.org/pipermail/squeak-dev/2008-February/125110....
You apologized for making a provoking comment there. Why again?
It was not intended to be a provoking comment.
*ahem* "a provocative comment".
Gulik.
Michael van der Gulik a écrit :
Andreas is a great programmer, but the example he posted had a bug in it and the proposed fix was incorrect.
The patch is correct in its Squeak context.
A feature taken for granted by many programmers, as indicated by the existing code base, worked 99.9% of the time but was not guaranteed 100%.
If this feature seems natural and its implementation does not break anything, why not change the API and make it conform to common expectations?
Andreas either added a new contract, or repaired a broken contract.
Your concerns are about
- portability to other dialects
- compatibility with future parallel threads
The first might not be major; other Smalltalks can be patched too (if needed).
The second cause is noble, but maybe not wise: you are putting too many constraints on code for an eventuality that might not happen any time soon. How can you foresee an API that still isn't defined? And if you can, can we? A Frenchman would call that "tirer des plans sur la comète" (roughly, building castles in the air).
I'll continue this discussion after I've released a stable, multi-threaded VM in which Andreas's example breaks.
Gulik.
With different API contracts and constraints a lot of code would break.
Either there will be a major rewrite of libraries, or you will have to provide backward compatibility for the older API.
What matters is that code is bad today, because programmer expectations do not meet the Kernel implementation. And Andreas solved this nicely, I think.
Maybe we are all wrong because you are able to provide a clever concurrent, parallel, multithreaded implementation 99% compatible with the existing code base, even the core BlockContext and Process classes; maybe you are a few years ahead. But until this is proven, we'll assume Andreas is right.
Sorry, I don't like crying with the wolves, but your repeated assertions deserved a response.
Friendly.
Nicolas
On 07/02/2008, nicolas cellier ncellier@ifrance.com wrote:
Michael van der Gulik wrote:
Andreas is a great programmer, but the example he posted had a bug in it and the proposed fix was incorrect.
The patch is correct in its Squeak context.
A feature taken for granted by many programmers, as indicated by the existing code base, worked 99.9% of the time but was not guaranteed 100%.
If this feature seems natural and its implementation does not break anything, why not change the API and make it conform to common expectations?
Andreas either added a new contract, or repaired a broken contract.
Please correct me if I'm wrong: Andreas is proposing that the given code will work:
workerProcess := [ ... ] fork.
while in the block you access the workerProcess variable and expect it to be initialized.
Now, if that's true, then some of us might assume that we can do something like this:
MyClass>>forkAndReturnForkedProcess
^ [ .... ] fork.
and in another method write:
MyAnotherClass>>forkProcess
	^ forkedProcess := myClass forkAndReturnForkedProcess.
and should he expect that this will work as well?
And then a third developer comes and writes another method which relies on such behavior, and so on. So where can this 'determinism' chain be stopped? At what point do you no longer have any guarantee whether the forked process accesses initialized or uninitialized state? How can I estimate where to stop with this kind of determinism? Wouldn't it be better to have 'determinism' in the following sense: if you fork a process, you should make sure that everything is properly initialized before doing the fork?
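The discipline Igor argues for, initialize everything (including the variable holding the process) before the process can run, can be sketched in Python, whose Thread objects are created suspended, like #newProcess, and only run after an explicit start, like #resume. The class and method names here are invented for illustration; this is a sketch of the idiom, not anyone's actual code:

```python
import threading

class WorkerOwner:
    """Create-then-start discipline: assign the suspended thread to the
    instance variable first, then start it, so the worker body can never
    observe an unset 'worker' variable."""
    def __init__(self):
        self.worker = None
        self.log = []

    def start_worker(self):
        t = threading.Thread(target=self._run)  # created suspended, like #newProcess
        self.worker = t                         # publish the reference BEFORE it can run
        t.start()                               # like #resume
        t.join()                                # for this sketch, wait for one pass

    def _run(self):
        # the identity test from Andreas' example is now safe:
        # self.worker was assigned before start() was called
        if threading.current_thread() is self.worker:
            self.log.append('worked')

owner = WorkerOwner()
owner.start_worker()
```

No scheduler assumption is needed here: correctness follows from ordering the assignment before the start, which is exactly the explicit #newProcess/#resume pattern discussed below.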
Your concerns are about
- portability to other dialects
- compatibility with future parallel threads
The first might not be major; other Smalltalks can be patched too (if needed).
The second cause is noble, but maybe not wise: you are putting too many constraints on code for an eventuality that might not happen any time soon. How can you foresee an API that still isn't defined? And if you can, can we? A Frenchman would call that "tirer des plans sur la comète" (roughly, building castles in the air).
I'll continue this discussion after I've released a stable, multi-threaded VM in which Andreas's example breaks.
Gulik.
With different API contracts and constraints a lot of code would break.
Either there will be a major rewrite of libraries, or you will have to provide backward compatibility for the older API.
Hmm, why should we pursue a major rewrite each time? Why not call it #determinatedFork (or whatever) and leave the original method untouched?
What matters is that code is bad today, because programmer expectations do not meet the Kernel implementation. And Andreas solved this nicely, I think.
I don't agree with that. I never expected that fork should work the way Andreas is proposing.
Maybe we are all wrong because you are able to provide a clever concurrent, parallel, multithreaded implementation 99% compatible with the existing code base, even the core BlockContext and Process classes; maybe you are a few years ahead. But until this is proven, we'll assume Andreas is right.
I don't think this is a case where we need to break compatibility.
Sorry, I don't like crying with the wolves, but your repeated assertions deserved a response.
Friendly.
Nicolas
And finally, do you really think that hiding concurrency issues from the developer's eyes helps him write good thread-safe code? I think it does exactly the opposite: it makes him feel safe and comfortable. But then, when the problem finally knocks at his door, he will not know what to do, because the API gives only a 99% guarantee that something will work, leaving the 1% to whom???
Igor Stasenko wrote:
Please correct me if I'm wrong: Andreas is proposing that the given code will work:
workerProcess := [ ... ] fork.
while in the block you access the workerProcess variable and expect it to be initialized.
Now, if that's true, then some of us might assume that we can do something like this:
MyClass>>forkAndReturnForkedProcess
^ [ .... ] fork.
and in another method write:
MyAnotherClass>>forkProcess
	^ forkedProcess := myClass forkAndReturnForkedProcess.
and should he expect that this will work as well?
As I understand it, the code above will work, because the forkedProcess won't have a chance to start until the activeProcess blocks on a Semaphore wait.
And then a third developer comes and writes another method which relies on such behavior, and so on. So where can this 'determinism' chain be stopped? At what point do you no longer have any guarantee whether the forked process accesses initialized or uninitialized state? How can I estimate where to stop with this kind of determinism? Wouldn't it be better to have 'determinism' in the following sense: if you fork a process, you should make sure that everything is properly initialized before doing the fork?
The simpler, the better. That's why I like Andreas' solution.
With different API contracts and constraints a lot of code would break.
Either there will be a major rewrite of libraries, or you will have to provide backward compatibility for the older API.
Hmm, why should we pursue a major rewrite each time? Why not call it #determinatedFork (or whatever) and leave the original method untouched?
A #randomFork API has no value per se. A #deterministicFork can simplify coding.
Of course, in a multi-processor VM, randomFork would be the default, and deterministicFork might still exist but be more costly (no parallelism).
What matters is that code is bad today, because programmer expectations do not meet the Kernel implementation. And Andreas solved this nicely, I think.
I don't agree with that. I never expected that fork should work the way Andreas is proposing.
Then Andreas' patch won't change anything for you. It's an easy way to make some existing code work.
I perfectly understand your point. You say we'd better change the existing code.
Because Andreas' implementation relies on correct sequencing based only on Process priority, it cannot be extended to the multi-processor case. So at the very least, this deterministicFork would not work without a rewrite.
It would be easier to mark explicitly the existing code relying on this feature, and rewrite only those parts later...
With Andreas' change, we choose the other alternative: be lazy now, and delay the hard work of reviewing every #fork until we have such a VM available.
Maybe we are all wrong because you are able to provide a clever concurrent parallel multithread implementation 99% compatible with existing code base, even core BlockContext and Process class, maybe you are a few years ahead, but until this is proven, we'll assume Andreas is right.
I don't think this is a case where we need to break compatibility.
I do not believe that you will be able to provide image compatibility, simply because a lot of code has been written with the single-Processor model in mind.
But maybe I'm wrong; you have looked at this part of the code longer than I have.
And finally, do you really think that hiding concurrency issues from the developer's eyes helps him write good thread-safe code? I think it does exactly the opposite: it makes him feel safe and comfortable.
I do not like this argument. It's like saying: coding is complex, and we must keep it complex to prevent the vulgus pecum from poking its nose in. Not the Smalltalk philosophy as I understood it.
If you say that using the proposed explicit #resume solution when needed would be as simple, more portable and future-proof, that's an argument I'm more inclined to hear.
But then, when the problem finally knocks at his door, he will not know what to do, because the API gives only a 99% guarantee that something will work, leaving the 1% to whom???
Now that it can be 100% with 5 lines of code, I say we should afford that facility.
It all depends on the horizon of availability of a multi-processor VM with all atomicity problems resolved. I tend not to believe it will arrive anytime soon, and I tend not to believe it would run an image without rewriting some parts. That's maybe where we most disagree, given the hard work you have put into it.
Cheers
Nicolas
On Feb 7, 2008, at 22:44 , Igor Stasenko wrote:
And finally, do you really think that hiding concurrency issues from the developer's eyes helps him write good thread-safe code? I think it does exactly the opposite: it makes him feel safe and comfortable.
By that reasoning, how about making the image crash on each doesNotUnderstand error? That will teach them! The all-purpose beginners' language C does that too, and actually goes out of its way to be neither safe nor comfortable, so surely this must be good.
- Bert -
On 08/02/2008, Bert Freudenberg bert@freudenbergs.de wrote:
On Feb 7, 2008, at 22:44 , Igor Stasenko wrote:
And finally, do you really think that hiding concurrency issues from the developer's eyes helps him write good thread-safe code? I think it does exactly the opposite: it makes him feel safe and comfortable.
By that reasoning, how about making the image crash on each doesNotUnderstand error? That will teach them! The all-purpose beginners' language C does that too, and actually goes out of its way to be neither safe nor comfortable, so surely this must be good.
You are right. But I just wanted to show that to develop a concurrent program, a developer should 'switch' his mind and look everywhere in his code where he needs to be very careful. And if you make him keep thinking that he's coding plain non-concurrent code, then he will never get correct results. He should learn from the very start that developing a concurrent program is very different from non-concurrent programming and requires different approaches.
- Bert -
nicolas cellier wrote:
Michael van der Gulik wrote:
Andreas is a great programmer, but the example he posted had a bug in it and the proposed fix was incorrect.
The patch is correct in its Squeak context.
Unless I've missed some correction to the patch, the patch isn't correct. For example, you'll get an "Invalid priority" walkback if you try to evaluate the following using the patch: [[Transcript show: 'works'; cr; flush] fork] forkAt: Processor lowestPriority
Furthermore, it allows lower-priority processes to run before a process forked at a higher priority. For example, here's an example that should open an inspector on #(1 2 3), but with the patch, it opens an inspector on #(1 3 2). The process forked at the lowest priority is run before the one forked at a higher priority.
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect
] forkAt: Processor lowestPriority + 1
To me, the existing behavior is deterministic. When you fork a process at the same priority as the running parent process, then the new forked process is added to the list of runnable processes at that priority. The parent process continues to run until either a higher priority process becomes runnable or it yields control. If a higher priority process becomes runnable, then the parent process is suspended and added to the end of the runnable processes at its priority. Since it is added at the end of the runnable processes list, the forked process will resume before the parent resumes.
John Brant
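The run-list rule John describes can be modeled in a few lines of Python. This is a toy single-priority-level simulation with invented names, not the actual VM scheduler; it only illustrates why, under the existing behavior, a process forked at the active priority deterministically resumes before its parent once the parent is preempted:

```python
from collections import deque

class ToyScheduler:
    """Toy model of John's rule for one priority level:
    a forked process is appended to the run list, and a preempted
    active process is ALSO appended, behind anything forked earlier."""
    def __init__(self, active):
        self.active = active
        self.runnable = deque()

    def fork(self, name):
        # a process forked at the active priority joins the END of the list
        self.runnable.append(name)

    def preempt(self):
        # a higher-priority process ran; the active process is suspended,
        # appended to the end, and the head of the list resumes
        self.runnable.append(self.active)
        self.active = self.runnable.popleft()
        return self.active

sched = ToyScheduler('parent')
sched.fork('worker')
# after the first preemption, the earlier-forked worker runs
# before the parent gets another turn
resumed = sched.preempt()
```

In this model the worker is guaranteed to resume before the parent, which is the sense in which John calls the existing behavior deterministic.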
John Brant wrote:
nicolas cellier wrote:
Michael van der Gulik wrote:
Andreas is a great programmer, but the example he posted had a bug in it and the proposed fix was incorrect.
The patch is correct in its Squeak context.
Unless I've missed some correction to the patch, the patch isn't correct. For example, you'll get an "Invalid priority" walkback if you try to evaluate the following using the patch: [[Transcript show: 'works'; cr; flush] fork] forkAt: Processor lowestPriority
Yes it's a flaw. Easy to correct however.
Furthermore, it allows lower-priority processes to run before a process forked at a higher priority. For example, here's an example that should open an inspector on #(1 2 3), but with the patch, it opens an inspector on #(1 3 2). The process forked at the lowest priority is run before the one forked at a higher priority.
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect
] forkAt: Processor lowestPriority + 1
OK, I like it. Nothing in Andreas' implementation would guarantee any child-to-child determinism; that's a good point.
Your example is worse because the results are counter-intuitive. That's a definitive argument.
So I take my vote back, and say only idiots don't change their minds. Sorry Andreas, you'd better use an explicit #newProcess and an explicit #resume as suggested.
Nicolas
To me, the existing behavior is deterministic. When you fork a process at the same priority as the running parent process, then the new forked process is added to the list of runnable processes at that priority. The parent process continues to run until either a higher priority process becomes runnable or it yields control. If a higher priority process becomes runnable, then the parent process is suspended and added to the end of the runnable processes at its priority. Since it is added at the end of the runnable processes list, the forked process will resume before the parent resumes.
John Brant
(Hmm, while I'm writing this, I saw an email from nicolas. Well, what synchronicity; he also uses the phrase "counter intuitive". But the argument is somehow different... So let me post this anyway...)
Unless I've missed some correction to the patch, the patch isn't correct. For example, you'll get an "Invalid priority" walkback if you try to evaluate the following using the patch: [[Transcript show: 'works'; cr; flush] fork] forkAt: Processor lowestPriority
Yes, the #lowestPriority case would be a problem. The #lowestPriority would be renamed to #reallyLowestPriority and new #lowestPriority would return #reallyLowestPriority+1?
Furthermore, it allows lower priority processes to run before a process forked at a higher priority. For example, here's an example, that should open an inspector on #(1 2 3), but with the patch, it opens on inspector on #(1 3 2). The process forked at the lowest priority is run before the one forked at a higher priority.
It depends on the interpretation of "should" above. One of the points sig and Michael made was that in the future we may have a parallel-processing VM and a different scheduler. In that case, a process with higher priority would get scheduled more often than a lower-priority one, but would not necessarily prevent the lower one from running, right? If we take that variation of scheduler into account, "nextPut: 3" may be scheduled before "nextPut: 2" and it is not "incorrect". (From one point of view, a program that relies on a particular implementation of scheduling is wrong.)
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect
] forkAt: Processor lowestPriority + 1
To me, the existing behavior is deterministic.
When Andreas said "99.99+%", I think he actually counted it. With the presence of higher-priority network activity and other interrupts, he reported that the original implementation wasn't deterministic, and I'd rather take his word. (I'm not going to try something 10,000 times myself, though.)
The result #(1 3 2) may look "counter-intuitive", but the idea was "rather be counter-intuitive than non-deterministic".
There was a little project to provide a 2D distributed conferencing system with video, audio, slides, and text chat in Squeak a few years back. Back then, they didn't have the reliable Semaphore patch (nor Andreas on the team). Sure enough, the system never became fully stable and eventually the project died. (They rewrote a clone system in Flex and it actually works really well.)
In the future, somebody might try to implement another kind of distributed interactive system. Whoever that person might be, he would certainly want to have this in his image...
-- Yoshiki
On Feb 8, 2008 1:14 PM, Yoshiki Ohshima yoshiki@vpri.org wrote:
(Hmm, while I'm writing this, I saw an email from nicolas. Well, what a sychronicity; he also uses the phrase "counter intuitive". But the argument is different somehow... So let me post this anyway...)
Unless I've missed some correction to the patch, the patch isn't correct. For example, you'll get an "Invalid priority" walkback if you try to evaluate the following using the patch: [[Transcript show: 'works'; cr; flush] fork] forkAt: Processor lowestPriority
Yes, the #lowestPriority case would be a problem. The #lowestPriority would be renamed to #reallyLowestPriority and new #lowestPriority would return #reallyLowestPriority+1?
Would you seriously consider changing this in the squeak.org image?
Gulik.
Yes, the #lowestPriority case would be a problem. The #lowestPriority would be renamed to #reallyLowestPriority and new #lowestPriority would return #reallyLowestPriority+1?
Would you seriously consider changing this in the squeak.org image?
I don't know the right answer to this question. I understand that most people don't need it and therefore don't want to have it. OTOH, I don't mind having it, and it would prevent a few people in the future from stumbling on the problem...
-- Yoshiki
On Feb 8, 2008 1:32 PM, Yoshiki Ohshima yoshiki@vpri.org wrote:
Yes, the #lowestPriority case would be a problem. The #lowestPriority would be renamed to #reallyLowestPriority and new #lowestPriority would return #reallyLowestPriority+1?
Would you seriously consider changing this in the squeak.org image?
I don't know the right answer to this question. I understand that most people don't need it and therefore don't want to have it. OTOH, I don't mind having it, and it would prevent a few people in the future from stumbling on the problem...
In that case, the lowest priority idle thread needs to be of a lower priority than #reallyLowestPriority, so I propose also adding #actualReallyLowestPriority to be the lowest priority, then #reallyLowestPriority to be #actualReallyLowestPriority+1 and #lowestPriority to be #actualReallyLowestPriority+2.
Gulik.
At Fri, 8 Feb 2008 14:11:10 +1300, Michael van der Gulik wrote:
On Feb 8, 2008 1:32 PM, Yoshiki Ohshima yoshiki@vpri.org wrote:
Yes, the #lowestPriority case would be a problem. The #lowestPriority would be renamed to #reallyLowestPriority and new #lowestPriority would return #reallyLowestPriority+1?
Would you seriously consider changing this in the squeak.org image?
I don't know the right answer to this question. I understand that most people don't need it and therefore don't want to have it. OTOH, I don't mind having it, and it would prevent a few people in the future from stumbling on the problem...
In that case, the lowest priority idle thread needs to be of a lower priority than #reallyLowestPriority, so I propose also adding #actualReallyLowestPriority to be the lowest priority, then #reallyLowestPriority to be #actualReallyLowestPriority+1 and #lowestPriority to be #actualReallyLowestPriority+2.
Ah, so by saying *this*, you're referring to the method name, not the idea. Sorry for misunderstanding.
No, we don't have to have #reallyLowestPriority method, sure.
-- Yoshiki
On 08/02/2008, Michael van der Gulik mikevdg@gmail.com wrote:
On Feb 8, 2008 1:32 PM, Yoshiki Ohshima yoshiki@vpri.org wrote:
Yes, the #lowestPriority case would be a problem. The #lowestPriority would be renamed to #reallyLowestPriority and new #lowestPriority would return #reallyLowestPriority+1?
Would you seriously consider changing this in the squeak.org image?
I don't know the right answer to this question. I understand that most people don't need it and therefore don't want to have it. OTOH, I don't mind having it, and it would prevent a few people in the future from stumbling on the problem...
In that case, the lowest priority idle thread needs to be of a lower priority than #reallyLowestPriority, so I propose also adding #actualReallyLowestPriority to be the lowest priority, then #reallyLowestPriority to be #actualReallyLowestPriority+1 and #lowestPriority to be #actualReallyLowestPriority+2.
I'd like to run this code:
block := [:counter |
	counter > 0 ifTrue: [
		[block value: counter - 1] forkAt: Processor currentPriority - 1]].
block value: 1000.
As a result, it should spawn 1000 processes, climbing down the priority levels.
Just want to note that if we use relative priority values, then we should make sure the given example works, and get rid of absolute priority values. Processor lowestPriority should then mean the priority of the process currently running with the lowest priority. And so you can schedule a new process with an even lower priority if you like. Same for #highestPriority: it should be the highest priority value in use by running processes.
Gulik.
-- http://people.squeakfoundation.org/person/mikevdg http://gulik.pbwiki.com/
Yoshiki Ohshima wrote:
(Hmm, while I'm writing this, I saw an email from nicolas. Well, what synchronicity; he also uses the phrase "counter intuitive". But the argument is somehow different... So let me post this anyway...)
Unless I've missed some correction to the patch, the patch isn't correct. For example, you'll get an "Invalid priority" walkback if you try to evaluate the following using the patch: [[Transcript show: 'works'; cr; flush] fork] forkAt: Processor lowestPriority
Yes, the #lowestPriority case would be a problem. The #lowestPriority would be renamed to #reallyLowestPriority and new #lowestPriority would return #reallyLowestPriority+1?
Hmm, I'd rather have thought of doing a #randomFork in this case.
Furthermore, it allows lower-priority processes to run before a process forked at a higher priority. For example, here's an example that should open an inspector on #(1 2 3), but with the patch, it opens an inspector on #(1 3 2). The process forked at the lowest priority is run before the one forked at a higher priority.
It depends on the interpretation of "should" above. One of the points sig and Michael made was that in the future we may have a parallel-processing VM and a different scheduler. In that case, a process with higher priority would get scheduled more often than a lower-priority one, but would not necessarily prevent the lower one from running, right? If we take that variation of scheduler into account, "nextPut: 3" may be scheduled before "nextPut: 2" and it is not "incorrect". (From one point of view, a program that relies on a particular implementation of scheduling is wrong.)
Yes, of course; no assumption based on priority is possible in that case.
Some expectations are based on priority in the current image. In particular, highestPriority means running unpreempted. But unpreempted also means exclusive on a single Processor. Not portable to multi-processor.
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect
] forkAt: Processor lowestPriority + 1
To me, the existing behavior is deterministic.
When Andreas said "99.99+%", I think he actually counted it. With the presence of higher-priority network activity and other interrupts, he reported that the original implementation wasn't deterministic, and I'd rather take his word. (I'm not going to try something 10,000 times myself, though.)
The result #(1 3 2) may look "counter-intuitive", but the idea was "rather be counter-intuitive than non-deterministic".
Yes, Andreas' contract was only between parent and child, and is therefore not broken by this example.
The current implementation is such that 3 is always last. Andreas' implementation is such that 1 is always first.
The order of the other two is not deterministic (the 0.01% of cases).
This example is just a trick, showing how expectations based on priority can be confusing ("preuve par l'absurde", proof by contradiction).
It puts into code what Igor was trying to say: it might be better not to expect anything and to rely, for example, on a good and simple #newProcess/#resume pair, which perfectly fits Andreas' need.
That's why I'm not at all sure his patch is needed now. It's a clever, brilliant patch, but it enforces expectations that are arguable. I better understand the counter-arguments now. Only my opinion...
Nicolas
There was a little project to provide a 2D distributed conferencing system with video, audio, slides, and text chat in Squeak a few years back. Back then, they didn't have the reliable Semaphore patch (nor Andreas on the team). Sure enough, the system never became fully stable and eventually the project died. (They rewrote a clone system in Flex and it actually works really well.)
In the future, somebody might try to implement another kind of distributed interactive system. Whoever that person might be, he would certainly want to have this in his image...
-- Yoshiki
Yoshiki Ohshima wrote:
Furthermore, it allows lower-priority processes to run before a process forked at a higher priority. For example, here's an example that should open an inspector on #(1 2 3), but with the patch, it opens an inspector on #(1 3 2). The process forked at the lowest priority is run before the one forked at a higher priority.
It depends on the interpretation of "should" above. One of the points sig and Michael made was that in the future we may have a parallel-processing VM and a different scheduler. In that case, a process with higher priority would get scheduled more often than a lower-priority one, but would not necessarily prevent the lower one from running, right? If we take that variation of scheduler into account, "nextPut: 3" may be scheduled before "nextPut: 2" and it is not "incorrect". (From one point of view, a program that relies on a particular implementation of scheduling is wrong.)
If we consider a parallel-processing VM, then Andreas' patch won't come close to working, since it won't guarantee that the forked process isn't started too early. Therefore, I wasn't considering a parallel processing scheduler. However, I think my statement is true for the current schedulers in VW, VA, Dolphin, and Squeak. All of them have a variation of a scheduler that has only a single active process and implements this rule: active process' priority >= all runnable processes' priorities
John Brant
John,
If we consider a parallel-processing VM, then Andreas' patch won't come close to working, since it won't guarantee that the forked process isn't started too early. Therefore, I wasn't considering a parallel processing scheduler.
Neither does the program that assumes it will get #(1 2 3) from your example code. So these are on par, roughly speaking.
And the patch makes the behavior easier to reason about, given the current implementation. So there is some value.
-- Yoshiki
On 08/02/2008, Yoshiki Ohshima yoshiki@vpri.org wrote:
I'd like to emphasize this (please increase the font size when you read the following phrase):
... a program that relies on a particular implementation of scheduling is wrong.
-- Yoshiki
Igor Stasenko wrote:
I'd like to emphasize this (please increase the font size when you read the following phrase):
... a program that relies on a particular implementation of scheduling is wrong.
Sorry, but that's complete nonsense. If you really believe this then I'll challenge you to write a program that you believe is "correct" for the following problem: create a program that will write the numbers 1 through 5 from five concurrent threads into an array in the order 1, 2, 3, 4, 5. Something that today you could do, for example, via:
result := Array new: 5.
1 to: 5 do:[:i|
	[result at: i put: i] forkAt: Processor activePriority+1.
].
*Regardless* of your implementation I will define a scheduling mechanism that breaks your program, i.e., makes it behave "wrong". In other words, if you believe what you wrote above you won't *ever* be able to write correct multi-threaded code.
Cheers, - Andreas
On 08/02/2008, Andreas Raab andreas.raab@gmx.de wrote:
Igor Stasenko wrote:
I'd like to emphasize this (please increase the font size when you read the following phrase):
... a program that relies on a particular implementation of scheduling is wrong.
Sorry, but that's complete nonsense. If you really believe this then I'll challenge you to write a program that you believe is "correct" for the following problem: create a program that will write the numbers 1 through 5 from five concurrent threads into an array in the order 1, 2, 3, 4, 5. Something that today you could do, for example, via:
result := Array new: 5.
1 to: 5 do:[:i|
	[result at: i put: i] forkAt: Processor activePriority+1.
].
I'll write only a block code, if you don't mind:
[ i == 1
	ifTrue: [result at: i put: i]
	ifFalse: [
		[(result at: i-1) isNil] whileTrue: [].
		result at: i put: i]]
.. this will work under the assumption that each block keeps its own separate copy of i in its own context. Otherwise code needs to be added to ensure this.
*Regardless* of your implementation I will define a scheduling mechanism that breaks your program, i.e., makes it behave "wrong". In other words, if you believe what you wrote above you won't *ever* be able to write correct multi-threaded code.
Unless you define a scheduling mechanism which performs program actions in random order (not in the sequence the developer defined).
Now answer me, please: is this kind of task (where you need to perform an action only after the previous one is complete) worth using concurrent threads of execution for? This is apparently a case of using the wrong tool to solve the problem. So it doesn't really matter whether my code works or not; it's simply wrong.
Cheers,
- Andreas
Igor Stasenko wrote:
I'll write only a block code, if you don't mind:
Sure.
[ i == 1
	ifTrue: [result at: i put: i]
	ifFalse: [
		[(result at: i-1) isNil] whileTrue: [].
		result at: i put: i]]
.. this will work under the assumption that each block keeps its own separate copy of i in its own context. Otherwise code needs to be added to ensure this.
This doesn't even come close. You can easily see this simply by assuming that a theoretical scheduler schedules later threads at a slightly higher priority and runs them like today's scheduler. So it would be roughly equivalent to the following:
1 to: 5 do: [:i |
    [i == 1
        ifTrue: [result at: i put: i]
        ifFalse: [
            [(result at: i - 1) isNil] whileTrue: [].
            result at: i put: i]] forkAt: Processor activePriority - 10 + i].
This simply won't do. Try again please. (BTW, for clarification, I expect the result array to be filled in when the "program" completes.)
*Regardless* of your implementation I will define a scheduling mechanism that breaks your program, i.e., makes it behave "wrong". In other words, if you believe what you wrote above, you won't *ever* be able to write correct multi-threaded code.
Unless you define a scheduling mechanism that performs program actions in random order (not in the sequence the developer defined).
Now answer me, please: is this kind of task (where you need to perform an action only after the previous one is complete) worth using concurrent threads of execution for?
It's not. But that's not my point. You claimed you can write "correct" code without assuming anything about scheduling. I'm simply disproving you. I'm willing to use any other non-trivial example you can come up with (with "trivial" being defined as non-interacting processes). I am only using the simplest set of interacting processes I could possibly think of - a sequence.
Cheers, - Andreas
On 08/02/2008, Andreas Raab andreas.raab@gmx.de wrote:
Igor Stasenko wrote:
I'll write only the block code, if you don't mind:
Sure.
[i == 1
    ifTrue: [result at: i put: i]
    ifFalse: [
        [(result at: i - 1) isNil] whileTrue: [].
        result at: i put: i]]
...this will work under the assumption that each block keeps its own, separate copy of i in its own context. Otherwise, code needs to be added to ensure this.
This doesn't even come close. You can easily see this simply by assuming that a theoretical scheduler schedules later threads at a slightly higher priority and runs them like today's scheduler. So it would be roughly equivalent to the following:
1 to: 5 do: [:i |
    [i == 1
        ifTrue: [result at: i put: i]
        ifFalse: [
            [(result at: i - 1) isNil] whileTrue: [].
            result at: i put: i]] forkAt: Processor activePriority - 10 + i].
This simply won't do. Try again please. (BTW, for clarification, I expect the result array to be filled in when the "program" completes.)
Well, I expect that your potential scheduler gives processes at _any_ priority a chance to run (doing something from time to time to switch the active process). Otherwise we're not in the parallel domain anymore. If you want, at some point, to get all results filled in, you can use a Semaphore, or a plain counter (just make sure you don't get into an ABA mess with it):
sema := Semaphore new.
1 to: 5 do: [:i |
    [i == 1 ifFalse: [[(result at: i - 1) isNil] whileTrue: []].
     result at: i put: i.
     sema signal] forkAt: Processor activePriority - 10 + i].
[sema excessSignals < 5] whileTrue: [].
"<here you got your array filled>"
*Regardless* of your implementation I will define a scheduling mechanism that breaks your program, i.e., makes it behave "wrong". In other words, if you believe what you wrote above, you won't *ever* be able to write correct multi-threaded code.
Unless you define a scheduling mechanism that performs program actions in random order (not in the sequence the developer defined).
Now answer me, please: is this kind of task (where you need to perform an action only after the previous one is complete) worth using concurrent threads of execution for?
It's not. But that's not my point. You claimed you can write "correct" code without assuming anything about scheduling. I'm simply disproving you. I'm willing to use any other non-trivial example you can come up with (with "trivial" being defined as non-interacting processes). I am only using the simplest set of interacting processes I could possibly think of - a sequence.
The only assumption about scheduling is that it allows processes to run in parallel, i.e., given N processes, there is some period of K seconds within which each process gets a chance to perform at least a single action.
Cheers,
- Andreas
Igor Stasenko wrote:
Well, I expect that your potential scheduler gives processes at _any_ priority a chance to run (doing something from time to time to switch the active process). Otherwise we're not in the parallel domain anymore.
Ah. Now we're starting with assumptions ;-)
If you want, at some point, to get all results filled in, you can use a Semaphore, or a plain counter (just make sure you don't get into an ABA mess with it):
sema := Semaphore new.
1 to: 5 do: [:i |
    [i == 1 ifFalse: [[(result at: i - 1) isNil] whileTrue: []].
     result at: i put: i.
     sema signal] forkAt: Processor activePriority - 10 + i].
[sema excessSignals < 5] whileTrue: [].
"<here you got your array filled>"
That's a little better. Still not good enough though. Now let's assume that my theoretical scheduler has limits on how many threads it can have scheduled at the same time (because of memory limits, cores, etc.). Not unusual at all. Would you like to try again?
Cheers, - Andreas
On 08/02/2008, Andreas Raab andreas.raab@gmx.de wrote:
Igor Stasenko wrote:
Well, I expect that your potential scheduler gives processes at _any_ priority a chance to run (doing something from time to time to switch the active process). Otherwise we're not in the parallel domain anymore.
Ah. Now we're starting with assumptions ;-)
Not really. As I said, if you claim that your scheduler is able to run two (or more) parallel processes, that should be true. Otherwise I can't call it a scheduler of parallel processes :)
If you want, at some point, to get all results filled in, you can use a Semaphore, or a plain counter (just make sure you don't get into an ABA mess with it):
sema := Semaphore new.
1 to: 5 do: [:i |
    [i == 1 ifFalse: [[(result at: i - 1) isNil] whileTrue: []].
     result at: i put: i.
     sema signal] forkAt: Processor activePriority - 10 + i].
[sema excessSignals < 5] whileTrue: [].
"<here you got your array filled>"
That's a little better. Still not good enough though. Now let's assume that my theoretical scheduler has limits on how many threads it can have scheduled at the same time (because of memory limits, cores, etc.). Not unusual at all. Would you like to try again?
Hmm, do you want me to change the code because the scheduler can't handle the problem itself? That is exactly what I'm against: there should be an API which gives me the feeling that my tasks are running in parallel, no matter what it does inside. My code should not depend on it: my task is to write a concurrent program; the scheduler's task is to make that possible :) Doesn't the low-level implementation serve to reach a higher level of abstraction, allowing the programmer to write code without deep knowledge of what is happening at the lower levels?
Cheers,
- Andreas
On Feb 8, 2008 7:17 PM, Andreas Raab andreas.raab@gmx.de wrote:
Igor Stasenko wrote:
I'd like to emphasize this (please increase the font size when reading the following phrase):
... a program that relies on a particular implementation of scheduling is wrong.
Sorry, but that's complete nonsense. If you really believe this then I'll challenge you to write a program that you believe is "correct" for the following problem: create a program that will write the numbers 1 through 5 from five concurrent threads into an array in the order 1, 2, 3, 4, 5. Something that today you could do, for example, via:
result := Array new: 5.
1 to: 5 do: [:i |
    [result at: i put: i] forkAt: Processor activePriority + 1].
*Regardless* of your implementation I will define a scheduling mechanism that breaks your program, i.e., makes it behave "wrong". In other words, if you believe what you wrote above, you won't *ever* be able to write correct multi-threaded code.
Complete with Squeak-isms:
n := 5.
result := Array new: n.
semaphores := Array new: n.
1 to: n do: [:each | semaphores at: each put: Semaphore new].

1 to: n do: [:each |
    [(semaphores at: each) wait.
     result at: each put: each.
     each >= semaphores size ifFalse: [
         (semaphores at: each + 1) signal]] fixTemps fork].
(semaphores at: 1) signal.
I'm mostly sure it's correct, but as with most concurrent code there's probably a bug there somewhere. It's an algorithm that's unable to take advantage of any concurrency because you make the requirement that items get added sequentially.
If items could be added in any order, then the following should work:
n := 15.
result := Array new: n.

1 to: n do: [:each |
    [result at: each put: each "Should be an atomic operation"] fixTemps fork].
Of course, even on a multi-CPU system, the following would execute faster because of the overhead of making Processes:
result := (1 to: 5) asArray.
This next version should fill the array, efficiently using the hypothetical CPU cores available:
go
    | anArray numProcesses step |
    anArray := Array new: 60000. "Assume an even larger number for realistic examples"
    numProcesses := 32.
    step := anArray size // numProcesses.
    1 to: anArray size by: step do: [:i |
        [i to: i + numProcesses - 1 do: [:j |
            anArray at: j put: j]] fixTemps fork].
    ^anArray
However, it doesn't work except for small numbers. I'd be happy if somebody would be able to provide a fixed version; I can't work it out.
Gulik.
On Feb 8, 2008 10:38 PM, Michael van der Gulik mikevdg@gmail.com wrote:
go
    | anArray numProcesses step |
    anArray := Array new: 60000. "Assume an even larger number for realistic examples"
    numProcesses := 32.
    step := anArray size // numProcesses.
    1 to: anArray size by: step do: [:i |
        [i to: i + numProcesses - 1 do: [:j |
            anArray at: j put: j]] fixTemps fork].
    ^anArray
However, it doesn't work except for small numbers. I'd be happy if somebody would be able to provide a fixed version; I can't work it out.
That was a really obvious, stupid bug. Try again:
anArray := Array new: 3200000.
numProcesses := 64.
step := anArray size // numProcesses.
1 to: anArray size by: step do: [:i |
    [i to: i + step - 1 do: [:j |
        anArray at: j put: j]] fixTemps fork].
Gulik.
On 08/02/2008, Michael van der Gulik mikevdg@gmail.com wrote:
On Feb 8, 2008 10:38 PM, Michael van der Gulik mikevdg@gmail.com wrote:
go
    | anArray numProcesses step |
    anArray := Array new: 60000. "Assume an even larger number for realistic examples"
    numProcesses := 32.
    step := anArray size // numProcesses.
    1 to: anArray size by: step do: [:i |
        [i to: i + numProcesses - 1 do: [:j |
            anArray at: j put: j]] fixTemps fork].
    ^anArray
However, it doesn't work except for small numbers. I'd be happy if
somebody would be able to provide a fixed version; I can't work it out.
That was a really obvious, stupid bug. Try again:
anArray := Array new: 3200000.
numProcesses := 64.
step := anArray size // numProcesses.
1 to: anArray size by: step do: [:i |
    [i to: i + step - 1 do: [:j |
        anArray at: j put: j]] fixTemps fork].
Gulik.
To Andreas: please note that Mike's code needed numerous corrections (because of different concurrency issues), but makes not a single assumption about how the scheduler handles things :)
-- http://people.squeakfoundation.org/person/mikevdg http://gulik.pbwiki.com/
And some notes about priority. I don't know many details of how the current Squeak scheduler arranges things; I just want to point out that you are using relative priorities in your samples and in your proposed patch, by writing Processor activePriority +/- n. This, in turn, uncovers another problem with the current scheduler implementation: it was not designed with relative priorities in mind. As people have pointed out, Processor highestPriority + 1 or Processor lowestPriority - 1 makes no sense, and leads to errors or unpredictable behavior. So first, I think that if you really want to use relative priority values, you should change the current implementation to do this well and without errors.
Now, suppose you arrange things the way I proposed in my previous post (have the highest/lowest priorities depend only on the current set of running processes); the only thing left is to introduce a scale. I propose logarithmic time slicing: a process with priority N does roughly twice as many operations as a process with priority N-1 in a given period of time.
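The scale Igor describes could be sketched as follows (purely illustrative; the selector and the baseTimeslice variable are assumed names, not part of Squeak's ProcessorScheduler):

```smalltalk
"Hypothetical sketch of logarithmic time slicing. A process at
 priority N gets roughly twice the operations of one at priority
 N-1, so the slice grows as a power of two with the priority.
 baseTimeslice is an assumed instance variable of the scheduler."
timesliceForPriority: aPriority
    "Answer the number of operations granted per scheduling period."
    ^baseTimeslice * (2 raisedTo: aPriority - 1)
```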
On Fri, Feb 8, 2008 at 11:58 AM, Igor Stasenko siguctua@gmail.com wrote:
And some notes about priority. I don't know many details of how the current Squeak scheduler arranges things; I just want to point out that you are using relative priorities in your samples and in your proposed patch, by writing Processor activePriority +/- n. This, in turn, uncovers another problem with the current scheduler implementation: it was not designed with relative priorities in mind. As people have pointed out, Processor highestPriority + 1 or Processor lowestPriority - 1 makes no sense, and leads to errors or unpredictable behavior. So first, I think that if you really want to use relative priority values, you should change the current implementation to do this well and without errors.
Now, suppose you arrange things the way I proposed in my previous post (have the highest/lowest priorities depend only on the current set of running processes); the only thing left is to introduce a scale. I propose logarithmic time slicing: a process with priority N does roughly twice as many operations as a process with priority N-1 in a given period of time.
Or you could go with what I proposed quite some time back and haven't had time to do anything about: a real event-driven preemptive scheduler like those found on the more advanced Unix systems (I don't think Linux has this yet).
Quick summary:
A lot more priorities than now (the first Sun implementation had 160, counting realtime). Every priority level has a timeslice; the higher the priority, the smaller the timeslice. If a process manages to use its whole timeslice, it gets demoted. If it doesn't, it gets promoted. This type of scheduler lets interactive processes always get the CPU, and also gives CPU-bound processes bigger time slices to work with.
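A minimal sketch of the demote/promote rule described above, with all names assumed for illustration (this is not an actual Squeak or Solaris scheduler API):

```smalltalk
"Hypothetical accounting step, run when a process leaves the CPU.
 Exhausting the slice demotes it (toward lower priority, bigger
 slices); blocking before the slice is used promotes it. Every
 selector here is illustrative, not an existing API."
accountFor: aProcess ranFor: usedTime
    | slice |
    slice := self timesliceAt: aProcess priority.
    usedTime >= slice
        ifTrue: [aProcess priority: (aProcess priority - 1 max: self lowestPriority)]
        ifFalse: [aProcess priority: (aProcess priority + 1 min: self highestPriority)]
```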
I'd like to emphasize this (please increase the font size when reading the following phrase):
... a program that relies on a particular implementation of scheduling is wrong.
There is still some mismatch.
Sure, at one level, you can say that if you want to ensure some ordering in a concurrent program, use a proper concurrency control mechanism; don't rely on the implementation detail.
But at another level, you can say that if you write your own scheduler, you can use full knowledge of it. The scheduler is just another module of your system. If using that knowledge makes the product rock-solid, there is nothing wrong with it.
And, with Andreas' patch, a normal programmer doesn't have to assume a scheduling order. His patch doesn't prevent people from using a proper concurrency mechanism.
In general, I agree that your statement above is a good principle, but in the current Squeak context, there is another path, too.
-- Yoshiki
On Feb 8, 2008, at 12:19 AM, John Brant wrote:
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect]
    forkAt: Processor lowestPriority + 1
I am confused: why does the first fork run before the second fork with Andreas's patch?
Mth
Mathieu Suen wrote:
On Feb 8, 2008, at 12:19 AM, John Brant wrote:
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect]
    forkAt: Processor lowestPriority + 1
I am confused: why does the first fork run before the second fork with Andreas's patch?
Without the patch, we get the following steps (I've labeled the blocks based on what they output to the shared queue: 1, 2, or 3):
A) the 1 block is evaluated at priority lowestPriority + 1
B) the 3 block is forked at priority lowestPriority
C) the 1 block continues to run since it is at a higher priority than the 3 block, and it outputs 1 on the queue
D) the 2 block is forked at the same priority as the 1 block and placed on the runnable process list for lowestPriority + 1
E) the 1 block continues to run until it is either interrupted by a higher priority process or it hits the Delay>>wait
F) the 2 block is evaluated since it has a higher priority than the 3 block; this outputs 2 on the queue and terminates
G) if the 1 block was interrupted before getting to Delay>>wait, it resumes and runs to the Delay>>wait
H) the 3 block is evaluated since 2 has terminated and 1 is waiting on the Delay -- it outputs 3 to the queue and terminates
I) finally 1 resumes after the delay, and opens an inspector on the result
With the patch we get (the first three steps are the same):
A) the 1 block is evaluated at priority lowestPriority + 1
B) the 3 block is forked at priority lowestPriority
C) the 1 block continues to run since it is at a higher priority than the 3 block, and it outputs 1 on the queue
D) a new process for the 2 block is created, but not marked as runnable; another process (2's helper) is forked at priority lowestPriority that will mark 2 as runnable when it runs
E) the 1 block continues to run until it hits the Delay>>wait
F) at this point we have the 3 block and 2's helper process waiting at priority lowestPriority, and 3 is first in the list, so the 3 block evaluates, putting 3 on the queue, and then terminates
G) 2's helper process runs and resumes the 2 block
H) the 2 block runs and outputs 2 on the queue, then it terminates
I) 2's helper terminates
J) finally 1 resumes after the delay, and opens an inspector on the result
BTW, we could get the correct #(1 2 3), if a higher priority process interrupted step F before the 3 was placed on the queue.
John Brant
On Feb 8, 2008, at 2:20 AM, John Brant wrote:
Mathieu Suen wrote:
On Feb 8, 2008, at 12:19 AM, John Brant wrote:
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect]
    forkAt: Processor lowestPriority + 1
I am confused: why does the first fork run before the second fork with Andreas's patch?
Without the patch, we get the following steps (I've labeled the blocks based on what they output to the shared queue: 1, 2, or 3):
A) the 1 block is evaluated at priority lowestPriority + 1
B) the 3 block is forked at priority lowestPriority
C) the 1 block continues to run since it is at a higher priority than the 3 block, and it outputs 1 on the queue
D) the 2 block is forked at the same priority as the 1 block and placed on the runnable process list for lowestPriority + 1
E) the 1 block continues to run until it is either interrupted by a higher priority process or it hits the Delay>>wait
F) the 2 block is evaluated since it has a higher priority than the 3 block; this outputs 2 on the queue and terminates
G) if the 1 block was interrupted before getting to Delay>>wait, it resumes and runs to the Delay>>wait
H) the 3 block is evaluated since 2 has terminated and 1 is waiting on the Delay -- it outputs 3 to the queue and terminates
I) finally 1 resumes after the delay, and opens an inspector on the result
With the patch we get (the first three steps are the same):
A) the 1 block is evaluated at priority lowestPriority + 1
B) the 3 block is forked at priority lowestPriority
C) the 1 block continues to run since it is at a higher priority than the 3 block, and it outputs 1 on the queue
D) a new process for the 2 block is created, but not marked as runnable; another process (2's helper) is forked at priority lowestPriority that will mark 2 as runnable when it runs
E) the 1 block continues to run until it hits the Delay>>wait
F) at this point we have the 3 block and 2's helper process waiting at priority lowestPriority, and 3 is first in the list, so the 3 block evaluates, putting 3 on the queue, and then terminates
G) 2's helper process runs and resumes the 2 block
H) the 2 block runs and outputs 2 on the queue, then it terminates
I) 2's helper terminates
J) finally 1 resumes after the delay, and opens an inspector on the result
BTW, we could get the correct #(1 2 3), if a higher priority process interrupted step F before the 3 was placed on the queue.
Thanks. I was stuck on the fact that #forkAt: is also patched, so I was reasoning with the idea that #forkAt: always forks a helper thread.
John Brant
Mth
Mathieu Suen a écrit :
On Feb 8, 2008, at 12:19 AM, John Brant wrote:
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect]
    forkAt: Processor lowestPriority + 1
I am confused: why does the first fork run before the second fork with Andreas's patch?
Mth
Because of the helperProcess trick. Draw a queue of processes (FIFO) for each priority.
(nextPut: 3) is added to the FIFO of priority 0. (nextPut: 2) is at priority 1, but it is not added immediately. The helperProcess that launches it is added at priority 0, therefore after (nextPut: 3).
So 1 is added to the list, then 3 (unless interrupted by a higher priority process - a 0.01% chance), then the helperProcess starts (nextPut: 2), and 2 is finally added...
Nicolas
Hi John -
If you consider these points as real concerns, both issues are easily addressed. I'm just not going to waste time on something that (given the way this thread is going) may just be a rhetorical response ;-)
Cheers, - Andreas
John Brant wrote:
nicolas cellier wrote:
Michael van der Gulik a écrit :
Andreas is a great programmer, but the example he posted had a bug in it and the proposed fix was incorrect.
The patch is correct in its Squeak context.
Unless I've missed some correction to the patch, the patch isn't correct. For example, you'll get an "Invalid priority" walkback if you try to evaluate the following using the patch:
[[Transcript show: 'works'; cr; flush] fork]
    forkAt: Processor lowestPriority
Furthermore, it allows lower priority processes to run before a process forked at a higher priority. For example, here's an example that should open an inspector on #(1 2 3), but with the patch it opens an inspector on #(1 3 2). The process forked at the lowest priority is run before the one forked at a higher priority.
[| queue |
queue := SharedQueue new.
[queue nextPut: 3] forkAt: Processor lowestPriority.
queue nextPut: 1.
[queue nextPut: 2] fork.
(Delay forSeconds: 1) wait. "Hack to let the blocks finish"
(Array with: queue next with: queue next with: queue next) inspect]
    forkAt: Processor lowestPriority + 1
To me, the existing behavior is deterministic. When you fork a process at the same priority as the running parent process, then the new forked process is added to the list of runnable processes at that priority. The parent process continues to run until either a higher priority process becomes runnable or it yields control. If a higher priority process becomes runnable, then the parent process is suspended and added to the end of the runnable processes at its priority. Since it is added at the end of the runnable processes list, the forked process will resume before the parent resumes.
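The behavior John describes can be illustrated with a small workspace snippet (timing-sensitive, so take it as an illustration of the unpatched same-priority semantics rather than a reliable test):

```smalltalk
"A same-priority fork is appended to the runnable list, so the
 parent keeps running until it yields or is preempted."
| order |
order := OrderedCollection new.
[order add: #child] fork.
order add: #parent.
Processor yield. "give the forked process its turn"
order inspect "expected under the unpatched scheduler: (#parent #child)"
```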
John Brant
Andreas is a great programmer, but the example he posted had a bug in it and the proposed fix was incorrect.
The patch is correct in its Squeak context.
Unless I've missed some correction to the patch, the patch isn't correct.
Here is code for my issue instead.
| queue stop s |
queue := SharedQueue new.
stop := false.
s := Semaphore new.
[s signal.
 [stop] whileFalse: [queue nextPut: true. Processor yield]] fork.
s wait.
[(Delay forMilliseconds: 500) wait. stop := true] fork.
[stop] whileFalse: [queue nextPut: false. Processor yield].
Without the patch *and with any scheduler that executes same-priority processes fairly* the program is ensured to finish. With the patch, the program might not finish. The two producer processes might ping-pong control to each other, and the delay won't even be started.
Paolo
"Paolo" == Paolo Bonzini bonzini@gnu.org writes:
Paolo> Without the patch *and with any scheduler that executes same-priority
Paolo> processes fairly* the program is ensured to finish. With the patch, the
Paolo> program might not finish. The two producer processes might ping-pong control
Paolo> to each other, and the delay won't even be started.
A workaround that handles badly written beginner code but eventually ruins the semantics for properly written expert code doesn't sound very good. It's like ensuring that every motorcycle sold automatically comes with training wheels. Ouch.
Paolo> Without the patch *and with any scheduler that executes same-priority
Paolo> processes fairly* the program is ensured to finish. With the patch, the
Paolo> program might not finish. The two producer processes might ping-pong control
Paolo> to each other, and the delay won't even be started.
A workaround that handles badly written beginner code but eventually ruins the semantics for properly written expert code doesn't sound very good. It's like ensuring that every motorcycle sold automatically comes with training wheels. Ouch.
I wasn't saying that my code is practical or well written -- it's just that I can prove it finishes. But indeed your motorcycle metaphor is what I was trying to say.
Paolo
At Fri, 08 Feb 2008 17:01:50 +0100, Paolo Bonzini wrote:
Paolo> Without the patch *and with any scheduler that executes same-priority
Paolo> processes fairly* the program is ensured to finish. With the patch, the
Paolo> program might not finish. The two producer processes might ping-pong control
Paolo> to each other, and the delay won't even be started.
A workaround that handles badly written beginner code but eventually ruins the semantics for properly written expert code doesn't sound very good. It's like ensuring that every motorcycle sold automatically comes with training wheels. Ouch.
I wasn't saying that my code is practical or well written -- it's just that I can prove it finishes. But indeed your motorcycle metaphor is what I was trying to say.
If I were to use the analogy, what was suggested wasn't training wheels. For casual, novice programmers, the proposed change is basically invisible. Sure, you can write some bad code that may starve one process, but that can happen without the patch too, if you write code that way.
The patch is useful when one tries to build a serious distributed interactive system, for example; in that case, the programmer should know how to write good code, so the system doesn't have to try to provide "safety" as much. One can always shoot oneself in the foot anyway. But there, having real determinism would help greatly in debugging and achieving a stable application.
-- Yoshiki
The patch is useful when one tries to build a serious distributed interactive system, for example; in that case, the programmer should know how to write good code, so the system doesn't have to try to provide "safety" as much.
The patch is actually *useless* if the programmer knows how to write good parallel code. Hence the analogy with training wheels.
Paolo
At Sat, 09 Feb 2008 14:26:06 +0100, Paolo Bonzini wrote:
The patch is useful when one tries to build a serious distributed interactive system, for example; in that case, the programmer should know how to write good code, so the system doesn't have to try to provide "safety" as much.
The patch is actually *useless* if the programmer knows how to write good parallel code. Hence the analogy with training wheels.
Well, Andreas said that he uses the patch in a system and seems to find good use for it. So the claim that it is useless contradicts reality. (Or he doesn't know how to write good parallel code.)
Or, a good programmer should be able to devise his own parallel programming idioms, so that Andreas' version doesn't have to be in the public image. (Probably this is closer to what you mean.)
Anyway, this is already in the domain of "rhetoric"...
-- Yoshiki
Well, Andreas said that he uses the patch in a system and seems to find good use for it. So the claim that it is useless contradicts reality. (Or he doesn't know how to write good parallel code.)
AFAIU, he has to deal with existing buggy code and this patch works around the bugs.
Paolo
On Wed, Feb 6, 2008 at 9:26 PM, Michael van der Gulik mikevdg@gmail.com wrote:
Er... I can't tell if you're joking or if you're serious.
I always write my code under the assumption that some day, in the future, some bright spark will write a better VM that will run separate Smalltalk Processes on separate OS processes/threads by whatever means, thus better utilising multiple cores or CPUs.
When that day comes, my code will run faster and your code will break.
If that pointless day ever comes, both of your programs will break. Shared-state programming is just too complicated for mortals to get right consistently.
On Sun, Mar 16, 2008 at 6:25 AM, Jason Johnson jason.johnson.081@gmail.com wrote:
On Wed, Feb 6, 2008 at 9:26 PM, Michael van der Gulik mikevdg@gmail.com wrote:
Er... I can't tell if you're joking or if you're serious.
I always write my code under the assumption that some day, in the future, some bright spark will write a better VM that will run separate Smalltalk Processes on separate OS processes/threads by whatever means, thus better utilising multiple cores or CPUs.
When that day comes, my code will run faster and your code will break.
If that pointless day ever comes, both of your programs will break. Shared-state programming is just too complicated for mortals to get right consistently.
I don't seem to have much trouble with it myself.
Gulik.
Andreas Raab wrote:
Paolo Bonzini wrote:
I'm with Terry on the correct idiom to use, i.e.
workerProcess := [self runWorkerProcess] newProcess.
workerProcess resume.
Sigh. One of the problems with examples is that they are ... well examples. They are not the actual code. The above solution is simply not applicable in our context (if it were, I would agree with it as the better solution).
Can you explain better?
My problem with your solution is that, in principle, the priority-1 process that you create could never be scheduled if you have two processes already running at the "right" priority. So you trade one "almost never happens" situation for another.
Paolo
On 05/02/2008, Andreas Raab andreas.raab@gmx.de wrote:
Paolo Bonzini wrote:
I'm with Terry on the correct idiom to use, i.e.
workerProcess := [self runWorkerProcess] newProcess.
workerProcess resume.
Sigh. One of the problems with examples is that they are ... well examples. They are not the actual code. The above solution is simply not applicable in our context (if it were, I would agree with it as the better solution).
Andreas, you are looking at the problem from the wrong angle, I think. Even in C, thread creation and resumption are separate (this code is from the socket plugin Win32 platform code):
-----
asyncLookupHandle = CreateThread(
    NULL,                   /* No security descriptor */
    0,                      /* default stack size */
    (LPTHREAD_START_ROUTINE) &sqGetHostByName, /* what to do */
    (LPVOID) PLUGIN_IPARAM, /* parameter for thread */
    CREATE_SUSPENDED,       /* create suspended so we can check the return value */
    &id);                   /* return value for thread id */
if(!asyncLookupHandle)
    printLastError(TEXT("CreateThread() failed"));
/* lookups run with normal priority */
if(!SetThreadPriority(asyncLookupHandle, THREAD_PRIORITY_NORMAL))
    printLastError(TEXT("SetThreadPriority() failed"));
if(!ResumeThread(asyncLookupHandle))
    printLastError(TEXT("ResumeThread() failed"));
-----
See, first you obtain the handle, run checks, set its priority, etc., and only then issue ResumeThread() to let it run.
So why should things look different in Smalltalk? The point is that once you enable a process to run, you can't guarantee that the current process will continue running and won't be preempted by the forked one. And changing the scheduler to favor one execution thread over another will not solve the problem, because your solution may fit your purposes but not those of others.
Simply don't assume that any code in the process which created a new process by issuing #fork has a chance to execute before the forked process actually starts running.
[BTW, I'm going to drop out of this thread since it's clear that there is too much opposition for such a change to get into Squeak. Which is fine by me - I'll wait until you get bitten in some really cruel and unusual ways, and at that point you might be ready to understand why this fix is valuable. Personally, I think that changes that take out an unusual case of non-determinism like this one are always worth it - if behavior is deterministic you can test it and fix it. If it's not, you might get lucky a hundred times in a row. And in the one critical situation it will bite you.]
Cheers,
- Andreas
"Igor" == Igor Stasenko siguctua@gmail.com writes:
Igor> Simply don't assume that any code in the process which created a new Igor> process by issuing #fork has a chance to execute before the forked Igor> process actually starts running.
Yeah, my preliminary conclusion has been swayed in the face of further evidence. I don't see any way to fix even this common breakage.
a := [something] fork
can never be guaranteed to put something into a before the forked block starts running. It would violate what fork is doing. If you want a lower priority, use forkAt:, or simply create the process suspended until it is ready.
Randal L. Schwartz wrote:
a := [something] fork
can never be guaranteed to put something into a before the forked block starts running. It would violate what fork is doing. If you want a lower priority, use forkAt:, or simply create the process suspended until it is ready.
Correct, and as Andreas said, his solution relies on the current implementation of the Squeak scheduler. In theory there is no guarantee that a new process with lower priority runs after my process just because it has lower priority; that depends entirely on the scheduler in use. So perhaps it would be better to fix the particular places in the code to use #newProcess and #resume.
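Applied to Andreas's original example, that fix might look like this (a sketch; it assumes the worker instance variable from the original post and plain #newProcess/#resume):

```smalltalk
MyClass>>startWorkerProcess
	"worker is an instance variable. Assign it *before* the process
	can run, so the identity test in the loop can never see it unset."
	worker := [self runWorkerProcess] newProcess.
	worker resume
```

Because #newProcess answers a suspended process, the assignment to worker is guaranteed to complete before the first evaluation of Processor activeProcess == worker.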
Perhaps [ doSomething ] fork is simply too beautiful for this use case...
However, this is yet more evidence in favor of share-nothing concurrency: http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf
:)
Regards, Martin
On 05/02/2008, Bert Freudenberg bert@freudenbergs.de wrote:
On Feb 5, 2008, at 16:43 , Igor Stasenko wrote:
Even in C you have [...]
So, why in smalltalk things should look different?
Err, because we use Smalltalk precisely because we do not want to deal with all the low-level stuff?
That's true of course, but this problem lies in the algorithmic plane (the order of operations), so the implementation language doesn't really matter.
- Bert -
Hi,
I ran into a similar problem in VAST that I got around by extending Block and Process. I extended Block with #forkReady, #forkReadyAt: and some others, and Process with #ready. The #forkReady extension to Block basically does what it says: it creates a new fork or process in the ready state but does not start it immediately. The new process waits for its normal turn to start. The #ready extension to Process makes the process ready without running it immediately and is used by #forkReady.
I didn't want to change the way #fork worked so I made the extensions, but I don't see any reason why #fork couldn't be changed to work the way I made #forkReady work. I would be surprised if anyone was counting on #fork immediately starting the new process before the process that creates the fork would yield for other reasons.
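In Squeak terms, such an extension might be sketched as follows (hypothetical: #forkReady and #ready are Lou's VAST selectors, not standard Squeak protocol, and the actual VAST implementation differs):

```smalltalk
BlockContext>>forkReady
	"Hypothetical sketch: answer a new process at the active priority
	that is runnable but has not preempted the active process."
	| process |
	process := self newProcess.
	process priority: Processor activePriority.
	process ready.	"hypothetical: enqueue as runnable without switching to it"
	^process
```

The key design point is that making a process runnable and transferring control to it are kept as two distinct steps, so the caller's next statements run before any scheduling decision involves the new process.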
Lou
On Mon, 04 Feb 2008 13:04:13 -0800, Andreas Raab andreas.raab@gmx.de wrote:
----------------------------------------------------------- Louis LaBrunda Keystone Software Corp. SkypeMe callto://PhotonDemon mailto:Lou@Keystone-Software.com http://www.Keystone-Software.com
I ran into a similar problem in VAST that I got around by extending Block and Process. I extended Block with #forkReady, #forkReadyAt: and some others, and Process with #ready. The #forkReady extension to Block basically does what it says: it creates a new fork or process in the ready state but does not start it immediately. The new process waits for its normal turn to start. The #ready extension to Process makes the process ready without running it immediately and is used by #forkReady.
I bet that your #forkReady has the same race condition problem that Andreas is trying to solve in Squeak's #fork. :-(
Paolo
Let's look at fork from a theoretical point of view, not the practical implementation. When a developer puts a fork in the code, he supposes that execution will spawn another, parallel process:
          /-------- original process
---------<
          \-------- spawned fork

--------> time
Suppose that in an ideal environment both processes start running in parallel, and there is no scheduling, preemption, or other gory details.
So it's obvious: if you need some state to be consistent in both execution threads, you should make sure to initialize it before the fork happens.
Andreas gave a good example of a typical error when writing concurrent code: trying to initialize the state after the fork, assuming we are not working in an ideal environment, and binding the solution to a specific implementation (knowledge of how the scheduler works, etc.).
Also, if the changes Andreas proposed take place, then fork will no longer be a fork (in the original sense), since its semantics strongly favor execution of the original process.
Also, if you ever need or want to port your code to another Smalltalk dialect, it has a good chance of not working correctly. And as you may know, catching such errors can take a lot of time.
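Igor's advice can be restated as code: initialize every piece of shared state before the fork, so the spawned process never depends on an assignment that races with its startup (a sketch; running is an assumed instance variable replacing the worker identity test from the original example):

```smalltalk
MyClass>>startWorkerProcess
	"Initialize the shared flag *before* forking; the forked loop
	never reads state that is assigned after the fork."
	running := true.
	worker := [[running] whileTrue: [self doWork]] fork

MyClass>>stopWorkerProcess
	"Stop the worker process; the loop exits on its next iteration."
	running := false
```

With this shape it no longer matters whether the forked process runs before or after the assignment to worker, because the loop condition depends only on state set before the fork.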
On Tue, 05 Feb 2008 22:12:37 +0100, Paolo Bonzini bonzini@gnu.org wrote:
I bet that your #forkReady has the same race condition problem that Andreas is trying to solve in Squeak's #fork. :-(
Paolo
I haven't shown any code because VAST and Squeak are different enough in this area that my #forkReady code would not be helpful. But if there is still a race condition, it should be greatly reduced, because the #forkReady code puts the newly created process in the queue of ready processes and returns. The setting of the instance variable would have to be interrupted immediately by a task switch, caused by a timer or some other interrupt. At least the task switch would not be caused by the fork code itself. Andreas' suggested code looks like it is trying to delay the task switch to the new process by what seems to me a somewhat convoluted means (sorry Andreas, no offence intended).
Lou ----------------------------------------------------------- Louis LaBrunda Keystone Software Corp. SkypeMe callto://PhotonDemon mailto:Lou@Keystone-Software.com http://www.Keystone-Software.com
Louis LaBrunda wrote:
I haven't shown any code because VAST and Squeak are different enough in this area that my #forkReady code would not be helpful. But if there is still a race condition, it should be greatly reduced, because the #forkReady code puts the newly created process in the queue of ready processes and returns. The setting of the instance variable would have to be interrupted immediately by a task switch, caused by a timer or some other interrupt.
Which is *precisely* the problem we're talking about. Can you imagine how annoying it is when the code works in all your tests, and then you ship it to a couple hundred customers and it blows up for purely statistical reasons?
Cheers, - Andreas
On Feb 5, 2008, at 1:12 PM, Paolo Bonzini wrote:
I bet that your #forkReady has the same race condition problem that Andreas is trying to solve in Squeak's #fork. :-(
I think you misunderstand (unless I misunderstand)... it sounds like the Process created by #forkReady doesn't start until explicitly resumed:
worker := [self workerLoop] forkReady. worker resume.
Also, Andreas is not trying to solve a race condition (for example, the race condition still exists if you fork at a higher priority). He's just trying to make it deterministic: make it either work all the time (as with forking a lower-priority process, or with Andreas's patch, a process of the same priority), or fail all of the time (as with forking a higher-priority process). This makes it easier to debug than having something that works 9999 times out of 10000.
Josh
Paolo
On Feb 5, 2008, at 2:57 PM, Joshua Gargus wrote:
On Feb 5, 2008, at 1:12 PM, Paolo Bonzini wrote:
I bet that your #forkReady has the same race condition problem that Andreas is trying to solve in Squeak's #fork. :-(
I think you misunderstand (unless I misunderstand)... it sounds like the Process created by #forkReady doesn't start until explicitly resumed:
I'm wrong again; I misunderstood how #forkReady is supposed to work. I don't know how the VAST scheduler works, so maybe there is no race condition, but it sounds very similar to the Squeak race condition that we're talking about.
I should really stop posting on this topic.
Josh
Paolo
squeak-dev@lists.squeakfoundation.org