On 8 Jun 2018, at 9:02, Marcus Denker wrote:
On 8 Jun 2018, at 08:46, Max Leske maxleske@gmail.com wrote:
Hi,
I was messing around with SmallDictionary when I suddenly realised that I can't find a single reason to use it over a normal Dictionary. While its name and class comment imply that it is somehow an optimised Dictionary, I don't see any measurement where that actually holds up. The following was run in a Pharo 7 image on a recent VM (see below):
| d | d := SmallDictionary new. d sizeInMemory. "24" [100000 timesRepeat: [ 1 to: 100 do: [ :i | d at:i put: i] ] ] timeToRun. "0:00:00:05.226"
[100000 timesRepeat: [ d at: 48 ifAbsent: [] ] ] timeToRun. "0:00:00:00.041"
| d | d := Dictionary new. d sizeInMemory. "16" [100000 timesRepeat: [ 1 to: 100 do: [ :i | d at:i put: i] ] ] timeToRun. "0:00:00:00.385" [100000 timesRepeat: [ d at: 48 ifAbsent: [] ] ] timeToRun. "0:00:00:00.006"
can you try with <10 elements?
The results are in! Note that I had to increase the repetitions by a factor or 100.
| d | d := SmallDictionary new. [10000000 timesRepeat: [ 1 to: 5 do: [ :i | d at:i put: i] ] ] timeToRun. "0:00:00:02.532"
[10000000 timesRepeat: [ d at: 4 ifAbsent: [] ] ] timeToRun. "0:00:00:00.686"
| d | d := Dictionary new. [10000000 timesRepeat: [ 1 to: 5 do: [ :i | d at:i put: i] ] ] timeToRun. "0:00:00:02.102"
[10000000 timesRepeat: [ d at: 4 ifAbsent: [] ] ] timeToRun. "0:00:00:00.567"
Is anyone aware of a reason for hanging on to SmallDictionary? I'm also curious to know how SmallDictionary came to be. There must have been some advantage over Dictionary at some point in the past.
It came from RB… the idea was that there are (in the Refactoring engine) a lot of very small dictionaries with <10 elements. The idea is that for such dictionaries, the overhead of hashing was higher than just linear search.
I am sure I did a benchmark back then, but of course this could have changed in the meantime.
It would be nice to redo the benchmark.
For size: does #sizeInMemory take the internal data into account? e.g for Dictionary there is an Array that contains Associations, you need to count those, too.
In the end this is a hack anyway: if needed Dictionary could do things like this transparently… without the user having to decide. And the question is if this is not just a bad side effect of the particular implementation.
Marcus