I see several reasons why the difference is smaller:
- you're also measuring the generation of the input; this creates new numbers and triggers GC more often
- you're only benchmarking numbers between 0 and 1, and #asIEEE32BitWord is a lot slower for negative values
- you're using #bench, which has high overhead
- you're comparing different versions than I did
About your modifications: self = NaN will always return false, because NaN is not equal to anything, including itself, so that comparison is wrong. self == NegativeZero will almost never be true, because #== tests object identity and boxed floats are rarely identical (try -0.0 == Float negativeZero). Use #= instead.
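The same pitfall exists outside Smalltalk; as an illustrative aside (my Python analogy, not anything from the thread), -0.0 compares equal to 0.0, so only the sign bit can separate them:

```python
import math
import struct

# -0.0 and 0.0 compare equal numerically, so an equality test alone
# cannot tell them apart...
assert -0.0 == 0.0
# ...but the sign bit is preserved, and copysign or the raw bit pattern
# exposes it.
assert math.copysign(1.0, -0.0) == -1.0
assert struct.pack('<f', -0.0) != struct.pack('<f', 0.0)
```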
After trying to understand what the code is supposed to do, I came to the conclusion that there's no reason to treat negative infinity and infinity separately.
hashKey32
	self > 0.0 ifTrue: [ ^16r80000003 + ((FloatArray basicNew: 1) at: 1 put: self; basicAt: 1) ].
	self < 0.0 ifTrue: [ ^16rFF800000 - ((FloatArray basicNew: 1) at: 1 put: self; basicAt: 1) ].
	self = self ifFalse: [ ^16rFFFFFFFF "NaN" ].
	(self at: 1) = 0 ifTrue: [ ^16r80000003 "Zero" ].
	^16r7F800000 "Negative zero"
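For readers outside Squeak, the mapping above can be sketched in Python (an illustrative port I wrote for this note, not the Squeak code; struct-based rounding stands in for the FloatArray primitive, and it raises OverflowError for finite doubles outside the binary32 range, where the FloatArray primitive rounds to infinity):

```python
import math
import struct

def float32_word(x):
    # Round the 64-bit float to binary32 and reinterpret its 4 bytes as an
    # unsigned 32-bit integer.
    return struct.unpack('<I', struct.pack('<f', x))[0]

def hash_key_32(x):
    # Map a float to an unsigned 32-bit key whose ordering matches numeric
    # ordering, keeping -0.0 distinct and NaN above everything.
    if x > 0.0:
        return 0x80000003 + float32_word(x)
    if x < 0.0:
        return 0xFF800000 - float32_word(x)
    if x != x:
        return 0xFFFFFFFF              # NaN
    if math.copysign(1.0, x) > 0:
        return 0x80000003              # +0.0
    return 0x7F800000                  # -0.0, just below +0.0

# Keys are strictly increasing along the number line.
samples = [float('-inf'), -3.5, -1.0, -0.0, 0.0, 1.0, 3.5, float('inf')]
keys = [hash_key_32(x) for x in samples]
assert keys == sorted(keys) and len(set(keys)) == len(keys)
```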
Levente
On Tue, 23 Dec 2014, Chris Muller wrote:
Here is what I used to measure:
| rand |
rand := Random seed: 12345.
[ rand next hashKey32 ] bench
This baseline version reports '902,000 per second.'
hashKey32
	self = NegativeInfinity ifTrue: [ ^ 0 ].
	self = Infinity ifTrue: [ ^ 4294967294 ].
	self = NaN ifTrue: [ ^ 4294967295 ].
	"Identity check to allow a distinction between -0.0 and +0.0."
	self == NegativeZero ifTrue: [ ^ 2147483650 ].
	"Smallest to largest negative IEEE 32-bit floats range from (2147483649 to: 4286578687), so invert that range."
	self negative ifTrue: [ ^ ("4286578687" 4286578688 - self asIEEE32BitWord) "+ 1" ].
	"We're positive. IEEE 32-bit positives range from (0 to: 2139095039)."
	^ self asIEEE32BitWord + 2147483651
Switching it to use FloatArray reports '3,530,000 per second.'
hashKey32
	| bits |
	self = NegativeInfinity ifTrue: [ ^ 0 ].
	self = Infinity ifTrue: [ ^ 4294967294 ].
	self = NaN ifTrue: [ ^ 4294967295 ].
	self = NegativeZero ifTrue: [ ^ 2147483651 ].
	bits := (FloatArray basicNew: 1) at: 1 put: self; basicAt: 1.
	self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
	^ 2147483651 + bits
Do you think the difference is less pronounced than yours due to my going through Random #next?
On Tue, Dec 23, 2014 at 5:52 PM, Levente Uzonyi leves@elte.hu wrote:
I don't know how you measured the speedup, but I got 8-12x improvement for finite numbers. By creating a new FloatArray, the speedup decreases to 6-9x.
Levente
On Tue, 23 Dec 2014, Chris Muller wrote:
On Mon, Dec 22, 2014 at 3:59 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 22.12.2014, at 00:13, Levente Uzonyi leves@elte.hu wrote:
ConverterFloatArray at: 1 put: self; basicAt: 1.
Any reason not to use this in #asIEEE32BitWord? Endianness? Arch-dependency?
I see, it's not thread-safe. This would be:
(FloatArray new: 1) at: 1 put: self; basicAt: 1.
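In terms a non-Squeaker can run, the one-element FloatArray trick corresponds to this (my Python analogy, not the Squeak code):

```python
import struct

def as_ieee_32bit_word(x):
    # Round the 64-bit float to binary32, then reinterpret its 4 bytes as an
    # unsigned integer -- morally what `(FloatArray new: 1) at: 1 put: x;
    # basicAt: 1` computes in a single primitive call.
    return struct.unpack('<I', struct.pack('<f', x))[0]

assert as_ieee_32bit_word(1.0) == 0x3F800000
assert as_ieee_32bit_word(-2.0) == 0xC0000000
```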
Might still be faster?
Yes. Since creating a one-element FloatArray every time did not hurt the performance of Levente's version too significantly (only 3.7X instead of 4.0X faster), I decided it was worth paying for the allocation rather than worrying about concurrency. So I ended up with Levente's latest version, except I can't risk a calculation ending up as -0.0, so I have to account for it too. And NaN too. Thus:
hashKey32
	| bits |
	self = NegativeInfinity ifTrue: [ ^ 0 ].
	self = Infinity ifTrue: [ ^ 4294967294 ].
	self = NaN ifTrue: [ ^ 4294967295 ].
	self = NegativeZero ifTrue: [ ^ 2147483651 ].
	bits := (FloatArray new: 1) at: 1 put: self; basicAt: 1.
	self < 0.0 ifTrue: [ ^ 4286578688 - bits ].
	^ 2147483651 + bits
Since there are not a full 32 bits' worth of distinct IEEE 32-bit floats (e.g., about 16.7 million bit patterns decode to NaN), it might be wise to move +Infinity and NaN _down_ a bit from the very maximum, for better continuity between the float and integer number lines, or for potential future special-case needs?
In any case, I wanted to at least see if what we have, above, works for every 32-bit IEEE float. To verify that, I enumerated all Floats in numerical order from -Infinity to +Infinity by creating them via #fromIEEE32Bit: from the appropriate ranges.
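That enumeration order can be sketched like this (my Python spot-check over a slice of the range, with struct standing in for #fromIEEE32Bit:; not the actual test Chris ran):

```python
import struct

def from_ieee_32bit_word(w):
    # Reinterpret a 32-bit word as a binary32 float (cf. Float>>fromIEEE32Bit:).
    return struct.unpack('<f', struct.pack('<I', w))[0]

# Ascending numeric order means descending words for the negative finites
# (16rFF7FFFFF down to 16r80000001), followed by ascending words for the
# non-negative finites (0 up to 16r7F7FFFFF). Spot-check a slice of each.
prev = float('-inf')
for w in list(range(0xFF7FFFFF, 0xFF7FF000, -1)) + list(range(0, 0x1000)):
    f = from_ieee_32bit_word(w)
    assert f > prev, (w, f, prev)
    prev = f
```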
It hit a snag at 2151677948. Check this out:
| this next |
this := Float fromIEEE32Bit: 2151677949.
next := Float fromIEEE32Bit: 2151677948.
self
	assert: next > this;
	assert: ((FloatArray new: 1) at: 1 put: next; basicAt: 1)
		= ((FloatArray new: 1) at: 1 put: this; basicAt: 1)
As I thought, the representations of IEEE floats and FloatArray floats are different enough that their precisions align differently onto the 32-bit map for these two floats. The IEEE (64-bit) representations are precise enough to distinguish these two floats; the FloatArray (32-bit) representations are not.
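The general effect is easy to reproduce in any language: two doubles whose difference lies below half an ulp of the binary32 target collapse to the same 32-bit word (my own minimal example, not the exact pair from the snag above):

```python
import struct

a = 1.0
b = 1.0 + 2.0 ** -30        # exactly representable in binary64, distinct from a
assert a != b

def word(x):
    # binary32 bit pattern of x after rounding to single precision
    return struct.unpack('<I', struct.pack('<f', x))[0]

# Both round to the same binary32 pattern: 2**-30 is far below the
# half-ulp (2**-24) needed to nudge 1.0 to the next single-precision float.
assert word(a) == word(b) == 0x3F800000
```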
That these guys are considered "equal" by the FloatArray is actually good enough for my indexing requirement, but now I'm looking at the prim-fail code for FloatArray:
at: index
	<primitive: 'primitiveAt' module: 'FloatArrayPlugin'>
	^Float fromIEEE32Bit: (self basicAt: index)
If either this primitive or the #at:put: primitive were ever to fail on one side only, storage (at:put:) but not access (at:) or vice versa, then it appears FloatArray itself would retrieve a value different from the one that was stored!