New subject: Pocket PC Performance

29 Dec 2002


Hi Torge,
If your assumption is right, the difference should be measurable
from within the VM with different images. I don't know if the devices
we're talking about have something equivalent to the RDTSC instruction
on i386 processors, but if there's a _really_ cheap way of measuring
sub-microsecond units (RDTSC, for example, counts clock cycles) then
it might be really worthwhile to instrument a VM and see if you can
find a difference (e.g., time spent in critical areas such as "full"
method lookup as a percentage of overall time spent).
The problem I have with your measurements is that (I think) they are not
really giving you any "good enough" evidence to make a case here. Even
if it is true that faster images show smaller lookup lengths, the
differences could still be attributed to many other factors - lots of
things have changed, and chasing VM inefficiencies is typically very
hard if you don't have any "hard numbers" to go on.
Personally, my feeling is "half and half" here. Yes, there could be a
problem with the mcache size as well as the speed of the full method
lookup. But then, it's _really_ hard to tell without hard numbers.
Cheers,
  - Andreas
...
-----Original Message-----
From: squeak-dev-admin@lists.squeakfoundation.org [mailto:squeak-dev-admin@lists.squeakfoundation.org] On Behalf Of Torge.Husfeldt@gmx.de
Sent: Sunday, December 29, 2002 6:38 PM
To: squeak-dev@lists.squeakfoundation.org
Cc: squeak-dev@lists.squeakfoundation.org
Subject: Re: Pocket PC Performance
Hi All,
Can someone who encounters the performance problems mentioned in this
thread please try out the following code snippets and report on the
outcome?
First try in a workspace:
| lookupLengths |
lookupLengths _ SortedCollection new.
"Collect probeLength -> (class -> selector) for every method in the image."
Behavior allSubInstancesDo: [:class | | md |
   md _ class methodDict.
   lookupLengths addAll: (md keys asSortedCollection collect: [:sel |
      (((md scanFor: sel) - sel identityHash) \\ md basicSize)
         -> (class -> sel)])].
lookupLengths asBag sortedCounts inspect.
"Parenthesized so that #inspect goes to the result, not to the literal 100."
(lookupLengths last: 100) inspect.
This will give you two inspectors.
The first shows the sorted counts of a bag whose entries should be
read as follows:
#occurrences -> #lookupLength -> sampleClass -> sampleSelector
Please report on the differences between a slow image and an
acceptable image (preferably on the same system).
The second gives you the details of the 100 methods with the highest
lookup lengths.
Please skim this list to see whether you can spot any Morphic-specific
selectors with long lookup lengths.
The second thing I want you to try is to grow all your
MethodDictionaries that have excessive lookup lengths. The following
code snippet will do this for you.
| lookupLengths |
Behavior allSubInstancesDo: [:class | | md |
   md _ class methodDict.
   md isEmpty ifFalse: [
      lookupLengths _ SortedCollection new.
      lookupLengths addAll: (md keys asSortedCollection collect: [:sel |
         (((md scanFor: sel) - sel identityHash) \\ md basicSize)
            -> (class -> sel)]).
      "Grow any dictionary whose longest probe chain exceeds 9."
      (lookupLengths last key > 9) ifTrue: [md grow]]]
Please report if your image "feels" any swifter after this operation.
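To put a number next to the "feel", one option is to time a fixed amount of Morphic redraw work before and after growing the dictionaries. The following is only a rough sketch: it assumes a Morphic image in which the global World answers the current PasteUpMorph and understands #fullRepaintNeeded and #displayWorld, and the repeat count of 20 is an arbitrary choice.

```smalltalk
| elapsed |
"Assumption: World is the current Morphic world (a PasteUpMorph)."
elapsed _ Time millisecondsToRun: [
   20 timesRepeat: [
      World fullRepaintNeeded.    "mark the whole world as damaged"
      World displayWorld]].       "redraw it right away"
Transcript show: 'Full repaint x20: ', elapsed printString, ' ms'; cr.
```

Run it a few times in each image and compare the ranges rather than single values, since garbage collection and background processes add noise.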
Note#0:
Be sure not to have any PackagePaneBrowser (aka 5-pane browser) open
when you do your tests, because these beasts will stop all Morphic
updating (and maybe event dispatch) for up to one second every second
on a slow machine. This is due to a design bug which can very easily
be avoided using a changeset I once posted to the list but don't have
the patience to dig up right now.
Note#1:
These operations might take a _very_ long time (especially on a slow
system), so be patient. On my 1700+ it was in the range of seconds,
but since you're encountering problems especially on slow systems you
will probably do the tests there, too -- so don't say I didn't warn
you.
Note#2:
Lookup lengths stand for the number of probes the VM has to do in _a
single method dictionary_ to find the method corresponding to a
selector. This is just a minimum measure because it doesn't count the
number of probes spent while following the superclass chain. These
numbers are typically small for almost empty method dictionaries but
may become huge when all superclasses have long probe chains and the
selector is only implemented in ProtoObject.
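As a rough illustration of the superclass-chain cost, the sketch below sums the probes for one send over every method dictionary visited on the way up. It reuses the probe arithmetic from the snippets earlier in this message; for a selector that is absent from a dictionary, scanFor: answers the first empty slot, so the same expression counts the probes needed to establish a miss. Morph and #yourself are just example choices, and the temporary names are made up for this sketch.

```smalltalk
| cls selector total md |
cls _ Morph.             "example receiver class (an assumption)"
selector _ #yourself.    "example selector, implemented in Object"
total _ 0.
[cls isNil] whileFalse: [
   md _ cls methodDict.
   "Probes in this dictionary: 1 means the first slot tried was the answer."
   total _ total + (((md scanFor: selector) - selector identityHash) \\ md basicSize).
   "Stop at the class that implements the selector, else continue upward."
   cls _ (md includesKey: selector)
      ifTrue: [nil]
      ifFalse: [cls superclass]].
total inspect.
```

This only models an uncached full lookup; sends that hit the VM's method cache never pay this cost.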
Note#3:
It is nowhere near guaranteed that this will change anything, because
lookup lengths aren't _supposed_ to make a difference. It is widely
believed that the VM's lookup cache mechanism should deal with the
performance hit that would result from long probe chains.
I do, however, have two strong hints that lookup lengths _might_ be
part of the problem you encounter. These are:
Hint#1: The problem has arisen rather gradually and no one has yet
been able to find any particular change that made the difference.
Hint#2: The lookup cache (as I understand it) seems to be rather small
for as big a system as Morphic (I only saw space for 512 entries last
time I looked) and gets flushed on several occasions (such as GCs).
Looking forward to your feedback,
Torge
