On 10/23/07, Peter William Lount peter@smalltalk.org wrote:
The principle is that anytime you have more than one thread or process working on the same memory space, or object space, you WILL have concurrency issues (unless your code is just running very simple concurrency). The point is that in order to implement your utopia-vision-of-simple-problem-free-concurrency (utopia-concurrencia for lack of a better name) in Smalltalk you MUST isolate the objects to ONLY ONE thread of possible alteration of their state otherwise you end up with the possibility of many classes of concurrency problems.
Yes, this is mostly true. The insight with Erlang is that they don't actually have to be in a different memory space, it just has to be impossible at the language level for one process to get a reference to an object of another process *and modify it*.
Shared memory problems exist even within one protected memory space and not just between them. To isolate the objects involved in a process you can have a separate object space which contains the objects that will be operated on. This is the Erlang way, isn't it?
Kind of. The Erlang approach works so well for them because variables can't be changed. Once you create a variable it is frozen in that form. Other process *can* look at it because no change can happen from either thread.
Obviously more care will have to be taken in Smalltalk as the objects can always be changed.
The thing about Erlang, unless I'm mistaken (and if I am mistaken I'd expect to be corrected), is that the objects in a process are only visible to that process until the results are returned. The objects that pass in and out of an Erlang process are only primitive data types and not complex objects.
The last sentence is incorrect. The message can be any complexity, including sending functions, file handles, whatever.
However for Smalltalk you'd need to pass in complex object graphs of arbitrary size and connectedness to be general purpose. This then results in a version problem.
Only if what you pass can be modified by either side.
For example, <snip> Now you've got a problem that the magical erlang message passing won't solve.
Problem: your example is using shared data and updating of variables. In the message passing paradigm *there is no shared data*. Period. None. In Erlang specifically there isn't updating of variables even within a process. So this would be done in Erlang something like this:
some_process(DataStructure) -> break_up_structure(DataStructure, 10000), get_new_structure({}, 10000). % return result of get_new_structure
break_up_structure(_, 0) -> done; % base case, no processes left break_up_structure(DataStructure, Processes) -> % otherwise RestOfDataStructure = split_and_send(DataStructure), % cut off a piece and send break_up_structure(RestOfDataStructure, Processes - 1). % tail call with new values
get_new_structure(DataStructure, 0) -> DataStructure; % base case, return what we built get_new_structure(DataStructure, Processes) -> Data = receive, %psuedo code for brevity NewDataStructure = add_data_to_structure(Data, DataStructure), get_new_structure(NewDataStructure, Processes - 1).
The fact that variables are immutable is dealt with in the normal functional programming way of using tail recursion and passing any variables that need "updating" as arguments.
In case the above code isn't clear: The process breaks up the parts of the data structure and farms them out to the different processes, then waits for responses and incrementally assembles them into the new data structure.
Now, the issue here is obviously: This only makes sense when the processing of the data that was carved out is more expensive then the carving out and reattaching. If the structure is very large that may well not be the case.
In that case I'm not sure how I would handle it, but I look at it like any other performance issue: I would try algorithm changes before I looked at going to a lower level.
Now someone mentioned Software Transactional Memory (STM) so briefly that it would be easy to miss. Is that your solution?
No, if someone else wants to look at this it's ok. I'm a bit concerned about the book keeping.
If so you still have other concurrency issues, object versioning issues, plus more to deal with. No solution is a panacea for all problems unless you are an advocate of silver bullet solutions.
There is no such thing, but just as a generational garbage collector is "good enough" in all but the most special cases, I believe message passing will be "good enough" as well.
The problem of editing a large graph of objects with many parallel threads is the generalized case of a nasty and complex set of concurrency and transactional issues. There are many ways to solve this. If you reply to this example I would hope that you do so fully explaining how you'd handle the concurrency and - importantly - the object consistency issues.
Transactional and concurrency issues arise because you are sharing something. If you give one entity alone access to that something and all access must go through him these issues go away. They are traded for new issues, but issues that are much easier to reason about.
Yes, I understand that early tests indicate that Erlang can handle approximately 100,000 or so processes at a time without hickups while Java can handle about 8,000 or so before blowing up.
No where near 8,000. At least not on any box I've ever seen (or do you have a reference?). The problem is Java's just too fat, on a 32-bit operating system you run out of memory well before 8k processes or threads.
I don't know what the various Smalltalks can handle, but I doubt it's as high as Erlang and is more likely less than even Java - just a guess though. Maybe someone has worked it out.
Actually Smalltalk is not so far from Erlang right now (theoretically. The question mark is the scheduling). Erlang is optimized for this so the size of each process might be half the size of a Smalltalk one (but I'm not sure of even this), but it's *certainly* much higher then any native process or thread solution can hope to achieve.
That's only because the current crop of operating systems were designed and envisioned when a few hundred processes and threads was considered a lot. Also because native operating system processes take a lot of resources.
It's because of the resources and how the OS deals with them. Keep in mind that a thread can call "detach" and become a running process, so some care has to be taken that space will be available. Of course linux deals with this by not having real threads at all, just processes that have the same memory map as other processes.
Yes, and how would the no sharing be implemented in Smalltalk?
This is what my investigations will reveal. As I alluded to in a previous mail, any immutable data is not a concurrency issue. It doesn't matter who can see it so long as it can't be updated. Mutable data (e.g. objects) are also no issue provided you can guarantee no process can get access to it besides the process that created it.
So that leaves globals, especially classes. Until I get into this I'm now 100% how I'll deal with it, but I can't image that it's not solvable.
How would you solve the concurrency one million node editing problem above without locking in your utopian threading implementation?
As described above.
What would you do to Smalltalk to make it do this. So far you and the others have been very short on specifics and have just argued that something magical can be done to make concurrency happen without locks.
With current hardware/OSes, there will be locks, but in the VM where they belong. The only structure in Erlang that must be atomic is the message "mailbox", it's the only place that should can be accessed at the same time by multiple processes.
A few papers and web sites have been linked to but no one has written down what they are proposing or what they mean past it can be done.
Well, I'm a Smalltalker. I form a vague idea and then go try to do it. I'll let you know what the specification is when I've implemented it. :) But I have researched into this as far as what exists today and I haven't seen anything I feel is a show stopper, nor anything that will require a change in Smalltalk semantics. It's very possible (even likely) that there's something I've overlooked, but I'll need to get into it to find that out.
I'll grant you that you can see that it can be done. Please illuminate what it is that you see can be done in detail and how you might do it. Thanks.
Is it clearer now? I feel that I have detailed it out twice now (the relevant details anyway).
However, you'll still end up with concurrency control issues and you've got an object version explosion problem occurring as well. How will you control concurrency problems with your simplified system? Is there a succinct description of the way that Erlang does it? Would that apply to Smalltalk?
Can you give an example of one of these issues, so I can explain how I would deal with it? Please note, there is *no data sharing, period* in this paradigm. At least at the language level.
Ok, so there would be 10,000 separate process-object-spaces with the one million nodes being edited and new nodes being created in each of these 10,000 separate spaces. How do you expect to "merge" the results and solve the edits that will inevitably cause "logical data inconsistency" collisions?
By having just one process that owns the data (or lots of processes that own their own piece of it) that all processes must talk to if they wish to make changes.
You simplified concurrency system also dramatically alters the Smalltalk paradigm.
The current paradigm is fine-grained locked/shared state.
So?
So obviously this part of the current paradigm will be altered, and I say it needs to be. Even if we find that certain parallel tasks need the old shared state method, this shouldn't be provide anywhere most people will find it. The problem is that most people who know how to do concurrency code only know this shared state model, so if you present multiple options they will all use this, the familiar.
Why? Please provide more than anticidal or belief driven comments for this point of view. What are the reasons? What is it that you'd be moving towards?
Because of the reasons I've laid out several times in this thread: 1) it does not scale, 2) it can not be composed, 3) it's incredibly difficult to reason about, 4) it's a low level detail, 5) it ensures encapsulation violation and on and on.
There are plenty of papers out there on this subject, if you are looking for me to go through them all and condense it for you in a summary more then I've already done then I'm afraid that's not going to happen. It's a pretty well known fact that shared-state fine grained locking *can not scale*.
It's a huge mistake on their part in my humble view.
While it may be easy from the point of view of adapting their image it's a huge mistake. I've had many people comment that that's one of the reasons that Java is better than Smalltalk
If someone thinks that mess that is Java is better then Smalltalk, I already question what useful information they can bring to the table. Java has *some things* better then Smalltalk sure, but such a statement is an "information smell" or a "taste smell" to say the least.
- it already works with multiple cpu
cores. Yes they have to solve the concurrency problems, but those are NO WORSE than the concurrency problems that already exist within Smalltalk when running with a single native process and multiple (green threads aka) Smalltalk Processes. No different. Do you actually get that?
For someone who is so violently against personally attacks, you sure hang over the fence, eh? Just because you're not understanding where I'm coming from doesn't mean these concepts are just beyond me and only you get it.
If you don't then you fail to appreciate that the approach that Cincom is taking isn't going to solve the concurrency problems since - unless they correct me on this - it seems that their direction is to simply have N-instances of their image (in the same memory space or in separate operating system processes) where N would frequently be the same as the number of cores on the computer (or server) in question (although the instances could be more or less as needed).
Which *does* solve it! And conveniently walks right past all the terrible issues that shared-state concurrency programming has. Once again, while Java people are trying to debug issues the Smalltalk guys will already be adding features to the next release.
Each individual image would still have the problems of multi-threading within it IF AND ONLY IF there are multiple threads forked.
Right, so don't do that. :)
This is of course a far cry from the radical concurrency system that is being proposed by the erlangization concurrency proponents.
Actually not so much. Erlang spanned actual CPU's by running more images, just like Smalltalk. So only the processes inside the image are different, but even this can be done today with discipline. I would like to remove this need for discipline by making it *impossible* to affect other processes, but so long as you make sure you don't update anything that other processes can see you could do this kind of message passing today.
There isn't any need for new syntax with the "!" character. Now sure you're using it with a binary message selector "!" but why obfuscate it. I'd recommend using a keyword selector for better clarity. Thanks.
This is just what Erlang uses. I want inter-process sends to stand out clearly.
Not so. You'd have to transmit - in my example above - one million objects to the various images and have them compute and return their resutls which would then have to be combined in a manner that leaves the graph of objects in a consistent state with one and a half million objects and 70% more interconnections between them. It is this parallel updating of many parts of the same data graph that will require the concurrency controls.
No. You are describing shared data which doesn't exist. No shared data = no locking needed.
Nothing but you've got to address the concurrency problem that I've mentioned above.
It wasn't a problem with message passing style, but for shared-state concurrency programming.
Are you talking about forking a new operating system process with a copy of the image?
I'm talking about: [ "some code" ] fork
These are object database problems and attempting to split the processing into multiple threads to avoid the "locking" issues does not solve the problem. It just pushes it further away. While it might work for some applications like telephone switching systems it can't generalize to ALL types of problems which could benefit from concurrency solutions.
No it can't, and I don't believe I ever said it did. But garbage collection can't either and we do fine with that as our only option in Squeak. If we need more we step outside the normal bounds, as it should be.
All Object Databases have a couple of rooted objects. Maybe many more than a couple.
Object databases are a whole other can of worms. I don't know how I would deal with it, but I would start by looking at what Mnesia (basically an object db for Erlang) does.
Yes, a variant of the Software Transactional Memory. However, you still have the problems mentioned above.
No, Software transactional memory means we update several variables inside an "automic" block and if the system notices something changed while these changes were being made it rolls the block back to what it was before.
I'm talking about a VM optimization to deal with metaclasses.
Having two spaces, old and new space, won't solve the problems mentioned above when you have N processes (threads) running on M-objects in parallel and need to combine the results of the parallel computations.
Old space and new space is purely for dealing with live code updates. Nothing more. I'm not trying to solve any object versioning issues, because I haven't seen any real evidence they will exist.
Many problems have this "split processes off with their chunk of data" and "recombine" the results. Many of these problems are simplified - if possible
- so that the results can't collide with the issues presented above.
However, we are not talking about those special cases - such as parallel ray tracing algorithms. We are talking about the completely generic cases that occur in general purpose and every day use of code in Smalltalk applications
The only things I can think of that wouldn't work in this model is problems where splitting up and rebuilding a dataset is more expensive then the actual processing. But I think can usually be solved by design changes.
- such as the massive Smalltalk business database front end applications
which are typical at many corporations today and which utilize many threads to accomplish their parallel tasks in order to speed up the user experience. A real world consequence of this is increased productivity of thousands of users day in and day out at these corporations.
I'm not sure what you're saying here. Apparently these Smalltalk applications aren't doing real multithreading now right (since it's only an option on a few ST implementations)? So how is offering a simple way to achieve concurrency going to make this worse?
Maybe your applications aren't a complex as these but I don't see the benefits of an Erlang ONLY approach. I do see the benefit of STM and Erlang approaches in some cases but why intentionally limit the tool box to just a few cases? It makes no sense to ignore the harsh reality of concurrency issues by picking a limited set of solutions.
For the reasons mentioned above. Choice isn't the holy grail you seem to think it is. If it was we would all be on the C++ list talking. Funny that we ended up in a language that 1) doesn't allow you to allocate your own memory, 2) forces you to use single inheritance, 3) forces you to use an image instead of files, etc.
I'm comfortable with the simplest thing being what works 90-99% the time and having to work much harder if I need something more.