Hi all!
while playing around with code simulation recently, I found out that the following dictionary lookup consumes extremely many bytecodes:
UserInterfaceTheme current get: #return for: SHTextStylerST80.
Dictionary >> #scanFor: takes 59 (sic!) attempts to find the right item in the dictionary. In other words, almost 20% of the property table is traversed, which does not conform my idea of a hashtable.
The reason for this can be found in Association >> #hash: Since Collections-ar.304, it does not take the value of an association into consideration for the hash any longer. As UserInterfaceTheme uses Associations as keys in the Dictionary, we have a lot of "hash misses". Even though you might not notice this in practice, I noticed this indeed during debugging und simulating, I can tell you. :-)
On the other hand, reverting Association >> #hash to honor the value slows down Unicode class compileAll indeed (significantly: 60 ms vs. 2100 ms), as noted in Andreas' original comment:
Name: Collections-ar.304 Author: ar Time: 11 February 2010, 11:52:58.959 pm UUID: 9e97b360-adf1-f745-8f8c-562aa08d566e Ancestors: Collections-ar.303 Fix an age-old bug in Association>>hash which causes extreme slowdowns in compiling Unicode methods. Association>>hash does not need to hash the value; it's slow and useless.
So my question is: What would be the right fix for this situation? Is it a) Do not use Associations in Dictionarys when its value matters for the lookup; or b) Do only use Associations in Dictionarys when its value maters for the lookup? For a), the dictionary layout of UserInterfaceTheme needs to be changed, for which I am attaching a changeset that replaces Associations by Arrays there.* For b), the layout of litIndSet in Encoder needs to be changed, for instance by using Bindings instead of Associations for classPool entries. However, I'm not familiar enough with Encoder etc. to answer this question.
*The changeset will automatically update all UserInterfaceTheme instances. Benchmark:
theme := UserInterfaceTheme current. random := Random seed: 20220526. properties := (1 to: 10000) collect: [:i | theme properties keys atRandom: random]. [properties collect: [:ea | theme get: ea]] bench.
Old: 45 ms | New: 10 ms
Looking forward to your opinion: How should we treat Associations in Dictionarys? Of course, this is not release-critical. :-)
Best, Christoph
=============== Postscript (accelerateUserInterfaceTheme.2.cs) ===============
"Postscript: Update UserInterfaceTheme instances"
UserInterfaceTheme allSubInstancesDo: [:theme | | newProperties | newProperties := theme properties copyEmpty. theme properties keysAndValuesDo: [:key :value | newProperties at: ((key isKindOf: Association) ifTrue: [{key key. key value}] ifFalse: [key]) put: value]. theme instVarNamed: 'properties' put: newProperties].
=============== Diff ===============
UserInterfaceTheme>>get: {private} · ct 5/26/2022 21:15 (changed) get: keyObject "keyObject is intended to be an Association. We have two lookup strategies: 1) along the superclass chain *of the client*, 2) via a linked theme. Evaluate the result because there can be message sends stored or blocks." | k | properties at: keyObject ifPresent: [:prop | ^ prop value]. - keyObject isVariableBinding "simple key objects" + keyObject isArray "simple key objects" ifFalse: [^ self getViaLink: keyObject]. - k := keyObject key. + k := keyObject at: 1. (self getViaSuperclasses: keyObject) ifNotNil: [:prop | ^ prop]. - keyObject key: k. "restore" + keyObject at: 1 put: k. "restore" ^ self getViaLink: keyObject
UserInterfaceTheme>>get:for: {private} · ct 5/26/2022 21:02 (changed) get: propertySymbol for: scope "For convenience. Does support access to non-class keys." | aClass | aClass := (scope isNil or: [scope isBehavior]) ifTrue: [scope] ifFalse: [Smalltalk classNamed: scope].
- aClass ifNotNil: [^ self get: aClass -> propertySymbol]. + aClass ifNotNil: [^ self get: {aClass. propertySymbol}]. properties - at: scope -> propertySymbol + at: {scope. propertySymbol} ifPresent: [:prop | ^ prop value]. - ^ self getViaLink: scope -> propertySymbol + ^ self getViaLink: {scope. propertySymbol}
UserInterfaceTheme>>getViaSuperclasses: {private} · ct 5/26/2022 21:15 (changed) getViaSuperclasses: keyObject "keyObject is intended to be an Association. Find the superclass of the key of the keyObject (which will initially be the client's class) and make a new keyObject using that and the original message name, then try searching for that." "We know we're the only referencer of keyObject. Update it rather than create new ones, for performance reasons." - keyObject key: keyObject key superclass. + keyObject at: 1 put: (keyObject at: 1) superclass.
- keyObject key ifNil: [^ nil]. + (keyObject at: 1) ifNil: [^ nil]. properties at: keyObject ifPresent: [:prop | ^ prop value]. ^ self getViaSuperclasses: keyObject
UserInterfaceTheme>>set:for:to: {building} · ct 5/26/2022 21:03 (changed) set: propertySymbol for: aClassOrSymbol to: valueObject "Where aClass asks its userInterfaceTheme for propertySymbol, provide valueObject." | aClass | aClass := aClassOrSymbol isBehavior ifTrue: [aClassOrSymbol] ifFalse: [Smalltalk classNamed: aClassOrSymbol]. aClass ifNil: [^ self]. ^ self atomicUpdate: [ : props | | key | - key := aClass -> propertySymbol. + key := {aClass. propertySymbol}. valueObject ifNil: [ props removeKey: key ifAbsent: [ "already cleared, don't error" ] ] ifNotNil: [ props at: key put: valueObject ] ]
UserInterfaceThemeRequest>>doesNotUnderstand: {lookup} · ct 5/26/2022 21:02 (changed) doesNotUnderstand: aMessage "Look up the visual attribute specified by aMessage's #selector in the current theme for the current target object."
aMessage numArgs = 0 ifTrue: [ - ^ (self theme get: self target class -> aMessage selector) + ^ (self theme get: {self target class. aMessage selector}) ifNil: [(self theme respondsTo: aMessage selector) ifTrue: [self theme perform: aMessage selector] ifFalse: [nil "unset property"]]]. aMessage numArgs = 1 ifTrue: [ ^ self theme - set: self target class -> aMessage selector asSimpleGetter + set: {self target class. aMessage selector asSimpleGetter} to: aMessage arguments first]. ^ self theme perform: aMessage selector withArguments: aMessage arguments.
--- Sent from Squeak Inbox Talk ["accelerateUserInterfaceTheme.2.cs"]
Hi Christoph,
On Thu, 26 May 2022, christoph.thiede@student.hpi.uni-potsdam.de wrote:
Hi all!
while playing around with code simulation recently, I found out that the following dictionary lookup consumes extremely many bytecodes:
UserInterfaceTheme current get: #return for: SHTextStylerST80.
Dictionary >> #scanFor: takes 59 (sic!) attempts to find the right item in the dictionary. In other words, almost 20% of the property table is traversed, which does not conform my idea of a hashtable.
The reason for this can be found in Association >> #hash: Since Collections-ar.304, it does not take the value of an association into consideration for the hash any longer. As UserInterfaceTheme uses Associations as keys in the Dictionary, we have a lot of "hash misses". Even though you might not notice this in practice, I noticed this indeed during debugging und simulating, I can tell you. :-)
On the other hand, reverting Association >> #hash to honor the value slows down Unicode class compileAll indeed (significantly: 60 ms vs. 2100 ms), as noted in Andreas' original comment:
Name: Collections-ar.304 Author: ar Time: 11 February 2010, 11:52:58.959 pm UUID: 9e97b360-adf1-f745-8f8c-562aa08d566e Ancestors: Collections-ar.303 Fix an age-old bug in Association>>hash which causes extreme slowdowns in compiling Unicode methods. Association>>hash does not need to hash the value; it's slow and useless.
So my question is: What would be the right fix for this situation? Is it a) Do not use Associations in Dictionarys when its value matters for the lookup; or b) Do only use Associations in Dictionarys when its value maters for the lookup?
It's the user's (the developer who decided to use a hashed collection) responsibility to provide keys that have a good hash distribution. If it's not possible with the default hash function, a PluggableDictionary should be used with a better one.
In this case, however, I would use Arrays as tuples instead of Associations because not all the keys are Associations in that dictionary, and Array's #hash is good enough here. Changing Association >> #hash would have too many side effects for such a localized problem to fix.
For a), the dictionary layout of UserInterfaceTheme needs to be changed, for which I am attaching a changeset that replaces Associations by Arrays there.* For b), the layout of litIndSet in Encoder needs to be changed, for instance by using Bindings instead of Associations for classPool entries. However, I'm not familiar enough with Encoder etc. to answer this question.
*The changeset will automatically update all UserInterfaceTheme instances. Benchmark:
theme := UserInterfaceTheme current. random := Random seed: 20220526. properties := (1 to: 10000) collect: [:i | theme properties keys atRandom: random]. [properties collect: [:ea | theme get: ea]] bench.
Old: 45 ms | New: 10 ms
Looking forward to your opinion: How should we treat Associations in Dictionarys? Of course, this is not release-critical. :-)
Best, Christoph
=============== Postscript (accelerateUserInterfaceTheme.2.cs) ===============
"Postscript: Update UserInterfaceTheme instances"
UserInterfaceTheme allSubInstancesDo: [:theme | | newProperties | newProperties := theme properties copyEmpty. theme properties keysAndValuesDo: [:key :value | newProperties at: ((key isKindOf: Association) ifTrue: [{key key. key value}] ifFalse: [key]) put: value]. theme instVarNamed: 'properties' put: newProperties].
There's a method named #atomicUpdate:. I'm not sure if it's needed here, but the name suggests that it's better to use that to update the properties. The rest of the changes looks good to me. +1
Levente
Hi Christoph --
I agree with Levente in that we should not change Association >> #hash.
+1 on your proposed changes to UserInterfaceTheme.
Maybe just use "UserInterfaceTheme cleanUpAndReset" as a postscript. :-) At this point, we still do not expect custom stuff in there.
Best, Marcel Am 26.05.2022 23:32:27 schrieb Levente Uzonyi leves@caesar.elte.hu: Hi Christoph,
On Thu, 26 May 2022, christoph.thiede@student.hpi.uni-potsdam.de wrote:
Hi all!
while playing around with code simulation recently, I found out that the following dictionary lookup consumes extremely many bytecodes:
UserInterfaceTheme current get: #return for: SHTextStylerST80.
Dictionary >> #scanFor: takes 59 (sic!) attempts to find the right item in the dictionary. In other words, almost 20% of the property table is traversed, which does not conform my idea of a hashtable.
The reason for this can be found in Association >> #hash: Since Collections-ar.304, it does not take the value of an association into consideration for the hash any longer. As UserInterfaceTheme uses Associations as keys in the Dictionary, we have a lot of "hash misses". Even though you might not notice this in practice, I noticed this indeed during debugging und simulating, I can tell you. :-)
On the other hand, reverting Association >> #hash to honor the value slows down Unicode class compileAll indeed (significantly: 60 ms vs. 2100 ms), as noted in Andreas' original comment:
Name: Collections-ar.304 Author: ar Time: 11 February 2010, 11:52:58.959 pm UUID: 9e97b360-adf1-f745-8f8c-562aa08d566e Ancestors: Collections-ar.303 Fix an age-old bug in Association>>hash which causes extreme slowdowns in compiling Unicode methods. Association>>hash does not need to hash the value; it's slow and useless.
So my question is: What would be the right fix for this situation? Is it a) Do not use Associations in Dictionarys when its value matters for the lookup; or b) Do only use Associations in Dictionarys when its value maters for the lookup?
It's the user's (the developer who decided to use a hashed collection) responsibility to provide keys that have a good hash distribution. If it's not possible with the default hash function, a PluggableDictionary should be used with a better one.
In this case, however, I would use Arrays as tuples instead of Associations because not all the keys are Associations in that dictionary, and Array's #hash is good enough here. Changing Association >> #hash would have too many side effects for such a localized problem to fix.
For a), the dictionary layout of UserInterfaceTheme needs to be changed, for which I am attaching a changeset that replaces Associations by Arrays there.* For b), the layout of litIndSet in Encoder needs to be changed, for instance by using Bindings instead of Associations for classPool entries. However, I'm not familiar enough with Encoder etc. to answer this question.
*The changeset will automatically update all UserInterfaceTheme instances. Benchmark:
theme := UserInterfaceTheme current. random := Random seed: 20220526. properties := (1 to: 10000) collect: [:i | theme properties keys atRandom: random]. [properties collect: [:ea | theme get: ea]] bench.
Old: 45 ms | New: 10 ms
Looking forward to your opinion: How should we treat Associations in Dictionarys? Of course, this is not release-critical. :-)
Best, Christoph
=============== Postscript (accelerateUserInterfaceTheme.2.cs) ===============
"Postscript: Update UserInterfaceTheme instances"
UserInterfaceTheme allSubInstancesDo: [:theme | | newProperties | newProperties := theme properties copyEmpty. theme properties keysAndValuesDo: [:key :value | newProperties at: ((key isKindOf: Association) ifTrue: [{key key. key value}] ifFalse: [key]) put: value]. theme instVarNamed: 'properties' put: newProperties].
There's a method named #atomicUpdate:. I'm not sure if it's needed here, but the name suggests that it's better to use that to update the properties. The rest of the changes looks good to me. +1
Levente
Hi Levente, Hi Marcel,
thank you for your feedback. Revised and merged via System-ct.1367.
(I decided against "UserInterfaceTheme cleanUpAndReset" because I used myself to modify my themes from time to time and would not like to revert such changes when avoidable.)
Best, Christoph
--- Sent from Squeak Inbox Talk
On 2022-07-07T10:23:22+02:00, marcel.taeumel@hpi.de wrote:
Hi Christoph --
I agree with Levente in that we should not change Association >> #hash.
+1 on your proposed changes to UserInterfaceTheme.
Maybe just use "UserInterfaceTheme cleanUpAndReset" as a postscript. :-) At this point, we still do not expect custom stuff in there.
Best, Marcel Am 26.05.2022 23:32:27 schrieb Levente Uzonyi <leves at caesar.elte.hu>: Hi Christoph,
On Thu, 26 May 2022, christoph.thiede at student.hpi.uni-potsdam.de wrote:
Hi all!
while playing around with code simulation recently, I found out that the following dictionary lookup consumes extremely many bytecodes:
UserInterfaceTheme current get: #return for: SHTextStylerST80.
Dictionary >> #scanFor: takes 59 (sic!) attempts to find the right item in the dictionary. In other words, almost 20% of the property table is traversed, which does not conform my idea of a hashtable.
The reason for this can be found in Association >> #hash: Since Collections-ar.304, it does not take the value of an association into consideration for the hash any longer. As UserInterfaceTheme uses Associations as keys in the Dictionary, we have a lot of "hash misses". Even though you might not notice this in practice, I noticed this indeed during debugging und simulating, I can tell you. :-)
On the other hand, reverting Association >> #hash to honor the value slows down Unicode class compileAll indeed (significantly: 60 ms vs. 2100 ms), as noted in Andreas' original comment:
Name: Collections-ar.304 Author: ar Time: 11 February 2010, 11:52:58.959 pm UUID: 9e97b360-adf1-f745-8f8c-562aa08d566e Ancestors: Collections-ar.303 Fix an age-old bug in Association>>hash which causes extreme slowdowns in compiling Unicode methods. Association>>hash does not need to hash the value; it's slow and useless.
So my question is: What would be the right fix for this situation? Is it a) Do not use Associations in Dictionarys when its value matters for the lookup; or b) Do only use Associations in Dictionarys when its value maters for the lookup?
It's the user's (the developer who decided to use a hashed collection) responsibility to provide keys that have a good hash distribution. If it's not possible with the default hash function, a PluggableDictionary should be used with a better one.
In this case, however, I would use Arrays as tuples instead of Associations because not all the keys are Associations in that dictionary, and Array's #hash is good enough here. Changing Association >> #hash would have too many side effects for such a localized problem to fix.
For a), the dictionary layout of UserInterfaceTheme needs to be changed, for which I am attaching a changeset that replaces Associations by Arrays there.* For b), the layout of litIndSet in Encoder needs to be changed, for instance by using Bindings instead of Associations for classPool entries. However, I'm not familiar enough with Encoder etc. to answer this question.
*The changeset will automatically update all UserInterfaceTheme instances. Benchmark:
theme := UserInterfaceTheme current. random := Random seed: 20220526. properties := (1 to: 10000) collect: [:i | theme properties keys atRandom: random]. [properties collect: [:ea | theme get: ea]] bench.
Old: 45 ms | New: 10 ms
Looking forward to your opinion: How should we treat Associations in Dictionarys? Of course, this is not release-critical. :-)
Best, Christoph
=============== Postscript (accelerateUserInterfaceTheme.2.cs) ===============
"Postscript: Update UserInterfaceTheme instances"
UserInterfaceTheme allSubInstancesDo: [:theme | | newProperties | newProperties := theme properties copyEmpty. theme properties keysAndValuesDo: [:key :value | newProperties at: ((key isKindOf: Association) ifTrue: [{key key. key value}] ifFalse: [key]) put: value]. theme instVarNamed: 'properties' put: newProperties].
There's a method named #atomicUpdate:. I'm not sure if it's needed here, but the name suggests that it's better to use that to update the properties. The rest of the changes looks good to me. +1
Levente
squeak-dev@lists.squeakfoundation.org