Hi guys. Because of some work I am doing with proxies, I would like to be able to set a specific identityHash on a proxy instance. I can add this primitive to my VM, but I was wondering whether this could also be of general interest. Cheers
On Tue, Jan 17, 2012 at 8:43 AM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
Hi guys. Because of some work I am doing with proxies, I would like to be able to set a specific identityHash on a proxy instance. I can add this primitive to my VM, but I was wondering whether this could also be of general interest.
It already exists. See primitiveSetIdentityHash in InterpreterPrimitives. Primitive # 161.
Cheers
-- Mariano http://marianopeck.wordpress.com
On Tue, Jan 17, 2012 at 8:04 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
It already exists. See primitiveSetIdentityHash in InterpreterPrimitives. Primitive # 161.
Thank you so much, Eliot. You even saved me the time of coding it ;) I should have checked before... I always forget about InterpreterPrimitives, hahaha. So... Pharoers... do you want the image side of the primitive and some tests? I can provide that if desired (in my opinion, I would include it).
On Tue, Jan 17, 2012 at 2:02 PM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
So... Pharoers... do you want the image side of the primitive and some tests? I can provide that if desired (in my opinion, I would include it).
I'd think carefully before including it :) It's for extremely hairy hacking. That said, see the attached for a plausible use
-- best, Eliot
After the discussions we have had, and with a really big comment, I would add it.
Stef
I really don't see what good could come of it being available in general…
Cheers, Henry
On Wed, Jan 18, 2012 at 11:25 AM, Henrik Johansen < henrik.s.johansen@veloxit.no> wrote:
I really don't see what good could come of it being available in general…
I think it is a nice feature to have. If it exists only in the VM, nobody will see it unless they load all the VM code and read through it. For example, in a certain hacky scenario, I wanted to create proxies that have exactly the same identityHash as the object they proxy. That primitive lets me do that. I was lucky that I asked on the mailing list; otherwise I would have missed it. And it is not as if we have no dangerous methods in the image, because we do. So... I would include it, since I think it could be useful for someone doing hacky stuff, but as Stef says, I would put a really, really clear comment on it.
On Wed, Jan 18, 2012 at 2:25 AM, Henrik Johansen < henrik.s.johansen@veloxit.no> wrote:
I really don't see what good could come of it being available in general…
I can think of one good use, which my file tried to illustrate. If Symbol instances' identity hashes were derived from their string hash, then they would hash the same in all images. One can take advantage of this in e.g. method dictionary layout and hence binary class loading. This happens in two steps.
With modern machines, where linear search through a dictionary is fast, and with a JIT with inline caching (and even an interpreter with a large method lookup cache), where method dictionaries are not looked at much, one can save a significant amount of space by making method dictionaries flat pair-wise arrays of selector, method. Most method dictionaries are small, and linear search is faster than fetching the Symbol's identity hash and doing a hash probe. By ordering method dictionaries by selector identityHash, very large method dictionaries such as Object's can be indexed using binary search. We saved about 8% of the image size in VisualWorks by moving to this representation (one saves on eliminating the nils in the selector vector and the value vector, and on eliminating the value vector/method array, hence saving its header space; you still need the space for the method). [The savings in Squeak look to be much less; I just found that the same overhead in the 4.3 trunk image is only ~ 1.7%].
Now, if in addition selector identityHashes are deterministic, derived from their string hash, then one does not need to rehash/reorder a method dictionary when loading it from a binary stream (e.g. Fuel), which is again a win.
Now, I'm not suggesting we do either of these things now, but making Symbol identity hashes deterministic, derived from their string hash, can enable significant optimisation further down the road.
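The layout described above can be tried out in a workspace. What follows is only a rough sketch under stated assumptions, not the VisualWorks implementation: strings stand in for compiled methods, the selector list is made up, and the hypothetical `stringHash` block stands in for a deterministic Symbol identityHash derived from the selector's characters.

```smalltalk
"Hypothetical sketch: a flat selector/method array kept in hash order."
| stringHash sorted flat lookup |

"Stand-in for a deterministic identityHash derived from the string hash."
stringHash := [:selector | selector asString hash].

"Order the (made-up) selectors by their derived hash."
sorted := #(#printOn: #size #at:put: #do: #isEmpty)
	asSortedCollection: [:a :b | (stringHash value: a) <= (stringHash value: b)].

"Lay the pairs out flat: selector at odd indices, method at even indices."
flat := Array new: sorted size * 2.
sorted withIndexDo: [:selector :i |
	flat at: 2 * i - 1 put: selector.
	flat at: 2 * i put: 'method for ', selector].

lookup := [:selector | | target low high mid index found |
	target := stringHash value: selector.
	"Binary search for the first pair whose selector hash is >= target."
	low := 1.
	high := flat size // 2 + 1.
	[low < high] whileTrue: [
		mid := (low + high) // 2.
		(stringHash value: (flat at: 2 * mid - 1)) < target
			ifTrue: [low := mid + 1]
			ifFalse: [high := mid]].
	"Scan forward over pairs with an equal hash, in case of collisions."
	found := nil.
	index := low.
	[found isNil
		and: [index <= (flat size // 2)
			and: [(stringHash value: (flat at: 2 * index - 1)) = target]]]
		whileTrue: [
			(flat at: 2 * index - 1) == selector
				ifTrue: [found := flat at: 2 * index].
			index := index + 1].
	found].

lookup value: #size.
```

For a dictionary this small a plain linear scan over the odd indices would do; the binary search only pays off for the very large cases. Because the ordering depends only on the selector's characters, an array built this way in one image would already be in the right order when loaded into another.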
On Wed, Jan 18, 2012 at 8:02 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Now, if in addition selector identityHashes are deterministic, derived from their string hash, then one does not need to rehash/reorder a method dictionary when loading it from a binary stream (e.g. Fuel), which is again a win.
Indeed, in Fuel we would save the rehash of MethodDictionaries.
On Wed, Jan 18, 2012 at 8:58 PM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
On Wed, Jan 18, 2012 at 8:02 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
I can think of one good use, which my file tried to illustrate.
I have another one, but I am not sure I can make it clear. In my case, I have original objects on which I do #becomeForward: to proxies. The original objects are then swapped out to disk and garbage collected. Later, I materialize them from disk and do proxies #becomeForward: materialized objects. I have no idea whether the original objects were stored in hashed collections, let alone in which collections. So, to avoid rehashing all instances of all hashed collections, I use the becomeForward: that copies the identityHash. The problem is that when I do original objects becomeForward: proxies, it changes the identityHash of the proxies, and what happens if the proxies were also stored in hashed collections? They would need a rehash...
So... with this primitive I can directly set the same identityHash on the proxy when I create it, since I know which object it will proxy :)
On Jan 18, 2012, at 8:02:50 PM, Eliot Miranda wrote:
Now, I'm not suggesting we do either of these things now, but making Symbol identity hashes deterministic, derived from their string hash, can enable significant optimisation further down the road.
And I wasn't suggesting it is a bad thing to use in all cases, but rather protesting having a method in the image screaming "use me if you need to change identityHash for whatever reason!" There's just no good general comment to put there on when it might be a good idea, and "Don't use this unless you know what you're doing!" never seems to stop anyone (well, speaking on my own behalf at least…)
I'd rather see usage defined on a case-by-case basis, where you can more explicitly comment why using it in this particular case is a good idea, like you wrote above, and what Mariano mentioned for proxies.
So rather than:

    Object >> setIdentityHashTo: aNumber
        <primitive: 161>

you have:

    Symbol >> initialize
        self deriveIdentityHashFrom: self hash

    Symbol >> deriveIdentityHashFrom: aNumber
        "This should ONLY be called as part of object initialization!"
        "Symbols benefit from not using the default identityHash by *insert Eliot's explanation here*"

and similar for Mariano's Proxy class.
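For the proxy case, the same pattern might look roughly like the following. This is a sketch only: the `Proxy` class, its assumed superclass Object, and both selectors are hypothetical, and the only thing taken from the thread is primitive 161 (primitiveSetIdentityHash). It also assumes the primitive takes the new hash as its single argument, as in the `setIdentityHashTo:` example.

```smalltalk
"Hypothetical Proxy class, assumed here to inherit from Object."

Proxy >> adoptIdentityHash: anInteger
	"This should ONLY be called when the proxy is created, before it can
	 have been added to any hashed collection. A proxy takes the
	 identityHash of the object it stands in for, so that swapping the
	 two with becomeForward: does not force hashed collections to be
	 rehashed."
	<primitive: 161>
	^ self primitiveFailed

Proxy class >> for: anObject
	"Answer a new proxy that hashes like anObject in identity-based collections."
	| proxy |
	proxy := self basicNew.
	proxy adoptIdentityHash: anObject identityHash.
	^ proxy
```

Keeping the primitive behind a narrowly named selector on the one class that needs it leaves room for a comment that explains this particular use, which is the point being argued here.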
Cheers, Henry
On Wed, Jan 18, 2012 at 12:54 PM, Henrik Johansen < henrik.s.johansen@veloxit.no> wrote:
I'd rather see usage defined on a case-by-case basis, where you can more explicitly comment why using it in this particular case is a good idea, like you wrote above, and what Mariano mentioned for proxies.
Good idea!
On Wed, Jan 18, 2012 at 10:02 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Good idea!
Wait... I am slower than Eliot ;) So... deriveIdentityHashFrom: should be the one with the primitive call, right? Otherwise I am lost.

    Symbol >> deriveIdentityHashFrom: aNumber
        "This should ONLY be called as part of object initialization!"
        "Symbols benefit from not using the default identityHash by *insert Eliot's explanation here*"
        <primitive: 161>

?
On Jan 18, 2012, at 11:05:11 PM, Mariano Martinez Peck wrote:
So... deriveIdentityHashFrom: should be the one with the primitive call, right? Otherwise I am lost.
Yes, exactly as it is in his change set example, which I'd missed :/
Cheers, Henry
+1 this is the idea :)
vm-dev@lists.squeakfoundation.org