On 23/10/2007, Jason Johnson <jason.johnson.081@gmail.com> wrote:
Problem: your example uses shared data and updating of variables. In the message-passing paradigm *there is no shared data*. Period. None. In Erlang specifically there is no updating of variables even within a process. So in Erlang this would be done something like this:
some_process(DataStructure) ->
    break_up_structure(DataStructure, 10000),
    get_new_structure({}, 10000).                % return result of get_new_structure

break_up_structure(_, 0) ->
    done;                                        % base case, no processes left
break_up_structure(DataStructure, Processes) -> % otherwise
    RestOfDataStructure = split_and_send(DataStructure),     % cut off a piece and send
    break_up_structure(RestOfDataStructure, Processes - 1).  % tail call with new values

get_new_structure(DataStructure, 0) ->
    DataStructure;                               % base case, return what we built
get_new_structure(DataStructure, Processes) ->
    Data = receive Msg -> Msg end,               % pseudocode for brevity
    NewDataStructure = add_data_to_structure(Data, DataStructure),
    get_new_structure(NewDataStructure, Processes - 1).
The fact that variables are immutable is dealt with in the normal functional-programming way: tail recursion, passing any variables that need "updating" as arguments.
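As a minimal, self-contained sketch of that idiom (the function and names here are mine, not from the code above): an accumulator that is "updated" only by passing a new value in each tail call:

```erlang
%% Sum a list without mutating anything: Acc is never written in place;
%% each recursive call simply supplies a new value for it.
sum(List) -> sum(List, 0).

sum([], Acc) -> Acc;                   % base case: return the accumulated value
sum([H | T], Acc) -> sum(T, Acc + H).  % tail call with the "updated" accumulator
```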
In case the above code isn't clear: the process breaks the data structure into pieces and farms them out to the different processes, then waits for the responses and incrementally assembles them into the new data structure.
Now, the obvious issue here is that this only makes sense when processing the carved-out data is more expensive than the carving out and reattaching. If the structure is very large, that may well not be the case.
In that case I'm not sure how I would handle it, but I look at it like any other performance issue: I would try algorithm changes before I looked at going to a lower level.
Now someone mentioned Software Transactional Memory (STM) so briefly that it would be easy to miss. Is that your solution?
No, if someone else wants to look at this, that's ok. I'm a bit concerned about the bookkeeping.
If so, you still have other concurrency issues, object versioning issues, and more to deal with. No solution is a panacea for all problems unless you are an advocate of silver-bullet solutions.
There is no such thing, but just as a generational garbage collector is "good enough" in all but the most special cases, I believe message passing will be "good enough" as well.
This perspective holds only if you have unlimited memory and zero-cost memory allocation. Let's look at this more precisely. I will write only in ST (I don't know Erlang) and, assuming I have understood your concept correctly, consider the following ST code:
SomeClass>>setVars
    self setVar1: value1.
    self setVar2: value2.
    ...
    ^ self
Here, at each message send, instead of writing to the receiver's memory we do copy-on-write cloning. So 'self setVar1: value1' returns a modified copy, self'. To keep things semantically correct, we then substitute that copy for self in 'self setVar2: value2', and so on; at the end, returning self really returns self'''''''. So each time we modify an object we get a modified copy instead of modifying the original.

Now think about the costs: memory allocation, and orders of magnitude more garbage generated. Even if we assume that each process has its own private memory region, that region still has to be located somewhere in physical memory. And as you may know, physical memory is shared among all cores, so your 'topmost' memory manager has no choice but to use the locking you so dislike to deal with concurrent requests for resources. As this example shows, under this model the memory manager very quickly becomes the great bottleneck, because of the orders-of-magnitude higher memory consumption.
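For comparison, the update-by-copying being described looks like this in Erlang's functional-update style (the record and field names here are hypothetical, chosen only to mirror the ST example):

```erlang
-record(obj, {var1, var2}).

%% Each "update" below allocates a fresh copy; nothing is modified in place.
set_vars(Obj, V1, V2) ->
    Obj1 = Obj#obj{var1 = V1},   % first copy: Obj itself is untouched
    Obj2 = Obj1#obj{var2 = V2},  % second copy: Obj1 is now garbage
    Obj2.                        % two allocations for two "writes"
```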
And now consider the alternative: even with a dumb lock-write-unlock we waste far fewer cycles, because in your 'non-locking' model the main load is just producing tons of garbage by cloning objects over and over.