I wish you had gotten involved in this thread earlier on. I think you explained everything better in this one message then I have in the whole thread. :)
On 10/30/07, Andreas Raab andreas.raab@gmx.de wrote:
Igor Stasenko wrote:
How would you define a boundaries of these entities in same image?
It is defined implicitly by the island in which a message executes. All objects created by the execution of a message are part of the island the computation occurs in.
To create an object in another island you need to artificially "move the computation" there. That's why islands implement the #new: message, so that you can create an object in another island by moving the computation, for example:
space := island future new: TSpace.
This will create an instance of TSpace in the target island. Once we have created the "root object" further messages that create objects will be inside that island, too. For example, take this method:
TSpace>>makeNewCube "Create a new cube in this space" cube := TCube new. self addChild: cube. ^cube
and then:
cube := space future makeNewCube.
Both, cube and space will be in the same island.
Could you illustrate by some simple examples, or strategy which can be used for using them for concurrent execution within single VM?
I'm confused about your use of the term "concurrent". Earlier you wrote "There is a BIG difference between concurrency (parallel execution with shared memory) and distributed computing." which seems to imply that you discount all means of concurrency that do not use shared memory. If that is really what you mean (which is clearly different from the usual meaning of the term concurrent) then indeed, there is no way for it to be "concurrent" because there simply is no shared mutable state between islands.
I'm very interested in practical usage of futures myself. What will you do, or how you would avoid the situation , when sometimes a two different islands containing a reference to the same object in VM will send direct messages to it, causing racing condition?
The implementation of future message sending uses locks and mutexes. You might say "aha! so it *is* using locks and mutexes" but just as with automatic garbage collection (which uses pointers and pointer arithmetic and explicit freeing) it is simply a means to implement the higher-level semantics. And since no mutual/nested locks are required deadlock-freeness can again be proven.
Yes. But this example is significant one. Sometimes i want these messages run in parallel, sometimes i don't. Even for single 'island'.
In the island model, this is not an option. The unit of concurrency is an island, period. If want to run computations in parallel that share data you either make the data immutable (which can enable sharing in some limited cases) or you copy the needed data to "worker islands". Basic load balancing.
Then, for general solution we need these islands be a very small (a smaller one is a single object) or contain big number of objects. The question is, how to give control of their sizes to developer. How developer can define a boundaries of island within single image?
By sending messages. See above.
I will not accept any solutions like 'multiple images' because this drives us into distributed computing domain, which is _NOT_ concurrent computing anymore, simple because its not using shared memory, and in fact there is no sharing at all, only a glimpse of it.
Again, you have a strange definition of the term concurrency. It does not (neither in general english nor in CS) require use of shared memory. There are two main classes of concurrent systems, namely those relying on (mutable) shared memory and those relying on message passing (sometimes utilizing immutable shared memory for optimization purposes because it's indistinguishable from copying). Erlang and E (and Croquet as long as you use it "correctly") all fall into the latter category.
This may be the outcome for an interim period. The good thing here is that you can *prove* that your program is deadlock-free simply by not using waits. And ain't that a nice property to have.
you mean waits like this (consider following two lines of code run in parallel):
[ a isUnlocked ] whileFalse: [ ]. b unlock.
and
[ b isUnlocked] whileFalse: []. a unlock.
Just like in your previous example, this code is meaningless in Croquet. You are assuming that a and b can be sent synchronous messages to and that they resolve while being in the busy-loop. As I have pointed out earlier this simply doesn't happen. Think of it that way: Results are itself communicated using future messages, e.g.,
Island>>invokeMessage: aMessage "Invoke the message and post the result back to the sender island" result := aMessage value. "compute result of the message" aMessage promise future value: result. "resolve associated promise"
so you cannot possibly wait for the response to a message you just scheduled. It is simply not possible, neither actively nor passively.
And how could you guarantee, that any bit of code in current ST image does not contain such hidden locks - like a loops or recursive loops which will never return until some external entity will change the state of some object(s)?
No more than I can or have to guarantee that any particular bit of the Squeak library is free of infinite loops. All we need to guarantee is that we don't introduce new dependencies, which thanks to future messages and promises we can guarantee. So if the subsystem is deadlock free before it will stay so in our usage of it. If it's not then, well, broken code is broken code no matter how you look at it.
I pointed that futures as an 'automatic lock-free' approach is not quite parallel to 'automatic memory management by GC'.
The similarity is striking. Both in terms of tradeoffs (trade low-level control for better productivity) as well as the style of arguments made against it ;-) Not that I mind by the way, I find these discussions necessary.
The striking is, that introducing GC does good things - removing a necessity to care about memory, which helps a lot in developing and makes code more clear and smaller. But i can't see how futures does same. There are still lot things to consider for developer even by using futures.
The main advantages are increased robustness and productivity. We worry a *lot* about deadlocks since some our usage of Croquet shows exactly the kind of "mixed usage" that you pointed out. But never, not once, have we had a deadlock or even have had to worry about it, in places where we used event-loop concurrency consistently. (interesting aside: Just today we had a very complex deadlock on one of our servers and my knee-jerk reaction was to try to convert it to event-loop concurrency because although we got stack traces we may not be able to completely figure out how the system ended up in that deadlock :-)
We've gradually continued to move to event-loop concurrency more and more in many areas of our code because the knowledge that this code will be deadlock-free allows us to concentrate on solving the problem at hand instead of figuring out the most unlikely occurences that can cause deadlock - I suspect that I'll be faster rewriting the code from today as event-loops than figuring out what caused and how to avoid that deadlock.
And that is in my understanding the most important part - how many hours have you spent thinking about how exactly a highly concurrent system could possibly deadlock? What if you could spend this time on improving the system instead, knowing that deadlock *simply cannot happen*?
Cheers,
- Andreas