Hi Tom,

On Fri, Mar 24, 2023 at 11:04 AM Tom Braun <me@tom-braun.de> wrote:
 
Hi all,

I am new to the list, so I couldn’t reply directly to the “Minifying Woes” thread (or at least didn’t know how to).

I identified 75 VM-internal objects in Tom’s (tobe) image by loading it in the simulator and setting a halt after the image had loaded, using the following code:

| collection |
collection := OrderedCollection new.
self allOldSpaceEntitiesDo: [:obj | (((self classIndexOf: obj) < self lastClassIndexPun) and: (self isImmediate: obj) not) ifTrue: [ collection add: obj] ].
(collection sorted: [:a :b | (self bytesInBody: a) > (self bytesInBody: b)]) collect: [:ea | ea hex -> (self bytesInBody: ea)]

Right.  A minor thing is that you don't need isImmediate: because by definition allOldSpaceEntitiesDo: enumerates objects, not immediates.  Instead you could say

self allOldSpaceEntitiesDo: [:obj | | ci | ci := self classIndexOf: obj. (ci isZero or: [ci between: self firstClassIndexPun and: self lastClassIndexPun]) ifTrue: [collection add: obj]]

or more simply

self allOldSpaceEntitiesDo: [:obj | (self classIndexOf: obj) <= self lastClassIndexPun ifTrue: [collection add: obj]]

I get the following 75 objects:

1. A free chunk after all other objects. Can be ignored for the purpose of minimising image size
2. Remembered set -> 1048592 bytes
3. hiddenRootsObj -> 32848 bytes
4 - 61. Pages of the mark and weakling stacks -> 32752 bytes each
62 - 74. Arrays of the class table -> 8208 bytes each
75. specialObjectsOop -> 520 bytes
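
For scale, the sizes in the list above can be summed in a workspace. This is only a sketch: it assumes items 4 - 61 are 58 pages and items 62 - 74 are 13 pages, it ignores the free chunk, and the variable names are made up for illustration.

| stackPages classTablePages total removable |
stackPages := 58 * 32752. "items 4 - 61"
classTablePages := 13 * 8208. "items 62 - 74"
total := 1048592 + 32848 + stackPages + classTablePages + 520.
removable := 1048592 + stackPages. "remembered set plus stack pages"
{ total. removable } "print it: #(3088280 2948208)"

So of roughly 3.09 MB of VM-internal objects, about 2.95 MB is the remembered set plus the stack pages.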

As far as I can judge the stack pages and remembered set could be removed from the image to minimize it further.
If I understand correctly the StackPages could be removed by using the SpurImagePreener.

I thought that this, in SpurImagePreener>>cloneObjects, would prevent any mark stack/ephemeron stack pages getting cloned.  Looks like I'm wrong.

    (self shouldClone: obj) ifTrue:
        [self cloneObject: obj]

shouldClone: obj
    ^(sourceHeap isValidObjStackPage: obj) not

shouldClone: might be as simple as

shouldClone: obj
    | classIndex |
    classIndex := self classIndexOf: obj.
    classIndex = 0 ifTrue: [^false]. "free objects have a class index of 0"
    (classIndex between: self firstClassIndexPun and: self lastClassIndexPun) ifFalse:
        [^true].
    "The hiddenRootsObject must be cloned; the remembered set must be cloned (but may be reduced in size); the classTable pages must be cloned. Anything else can be discarded."
    ^obj = sourceHeap hiddenRootsObject
        or: [obj = sourceHeap rememberedSetObj
        or: [classIndex = self arrayClassIndexPun]] "class table pages"

The remembered set could be removed in the SpurImagePreener too (when the VM initialises the memory it initialises a new remembered set too, if it is nil).

Cool; I wasn't sure if it was initialized on start-up. It does seem to be. We need to check that the old one gets collected.  In fact, the old one should be used if it exists, because its size is a good predictor of how big it needs to be.

After a quick read of the SpurImagePreener I didn’t see that the remembered set gets removed.

Right. I had forgotten to do this.

When I tried using the preener it resulted in an unusable image (both simulator and compiled VM couldn’t load it).
I tried both:

SpurImagePreener new 
preenImage: '/Users/tombraun/Desktop/Squeak6.0-22104-64bit copy.image'

SpurImagePreener new 
writeDefaultHeader: true;
savedWindowSize: 1@1;
preenImage: '/Users/tombraun/Desktop/Squeak6.0-22104-64bit copy.image'

Could be a me problem, as I made some changes to the memory management in my VMMaker image, although this shouldn’t influence the preener….

Well, it can be fixed :-)

On this note @Eliot: why the decision to make the object stacks and the remembered set VM-managed objects instead of allocating them separately?
1. We don’t need to keep them in a snapshot. All object stacks are empty after GC, and since we flush new space before a snapshot, the remembered set shouldn't need to be persisted either.
2. During GC we simply mark all stack pages and keep them alive. If we at least freed empty pages (after a limit, to prevent the running VM from having to allocate too many pages every GC?) I would see the value.

What did I overlook, or are the reasons simply historical?

If one has good, high-quality machinery for a heap manager then it makes sense to use it for all allocations, not just those of the mutator.  This includes their memory usage in footprint measurements etc., instead of hiding it in the C allocator. It also means that it is easy to find these allocations, as above, whereas if they were allocated in C one could easily lose space if there was a leak, etc.  So in my opinion (and in others') high-quality heap managers should manage as much of their internal storage as possible.
 
Best,
Tom (WoC)
 
_,,,^..^,,,_
best, Eliot