Make image startup slightly faster.
a) Don't rewrite every oop to clear the mark bit; only rewrite the ones that have the mark bit set.
b) Don't rewrite every oop to change its address if the offset is zero, which is the case on the Mac when the same VM reads a recently saved image, because of where the memory map is located. It is unknown whether this is also the case on Wintel.
c) Don't rescan the entire image a second time to null out compiled methods with an external primitive call. Remember those oops on the first scan, then later null out only the 320 or so remembered oops, versus rescanning 260 thousand oops.
Now, one of the 'ugly' things I did here, to avoid altering the Interpreter class definition and having 16K of memory sucked up by a global one-time-use table, is to allocate a local 4096-element Array in ObjectMemory>>initializeObjectMemory:. It gets passed to an altered adjustAllOopsBy:, which remembers the oops of methods that refer to external primitives while it is looking for oops with the mark bit set.
That array, along with the stored count, is returned, and at the end of initializeObjectMemory: we decide what to do. If the count is >= 4095 we invoke the old flushExternalPrimitives; otherwise (in my testing with the base image only 320 oops match) we just invoke flushExternalPrimitiveOf: on the remembered oops.
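The single-pass idea described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the actual VM code: the header layout, the MARK_BIT and EXT_PRIM_BIT flags, the RememberTable type and both function names are hypothetical, and an explicit overflow flag stands in for the "count >= 4095" test. It shows a scan that writes only to headers whose mark bit is set, remembers candidate methods in a bounded table, and falls back to a full rescan only if the table overflowed.

```c
#include <stddef.h>
#include <stdint.h>

#define MARK_BIT     0x80000000u  /* hypothetical header flag */
#define EXT_PRIM_BIT 0x40000000u  /* hypothetical: method caches an external primitive */
#define REMEMBER_MAX 4096         /* table size mentioned in the post */

typedef struct {
    size_t count;                 /* number of remembered entries */
    size_t index[REMEMBER_MAX];   /* positions of remembered headers */
    int overflowed;               /* set when more candidates than slots were seen */
} RememberTable;

/* One pass: clear the mark bit only where it is set, and remember
   headers that carry the external-primitive flag. */
static void scan_headers(uint32_t *headers, size_t n,
                         RememberTable *t, size_t *writes) {
    t->count = 0;
    t->overflowed = 0;
    *writes = 0;
    for (size_t i = 0; i < n; i++) {
        if (headers[i] & MARK_BIT) {
            headers[i] &= ~MARK_BIT;   /* write only when needed */
            (*writes)++;
        }
        if (headers[i] & EXT_PRIM_BIT) {
            if (t->count < REMEMBER_MAX) {
                t->index[t->count++] = i;
            } else {
                t->overflowed = 1;
            }
        }
    }
}

/* Flush remembered entries, or rescan everything if the table overflowed.
   Returns how many headers were visited. */
static size_t flush_external_prims(uint32_t *headers, size_t n,
                                   const RememberTable *t) {
    size_t visited = 0;
    if (t->overflowed) {
        for (size_t i = 0; i < n; i++) {
            visited++;
            headers[i] &= ~EXT_PRIM_BIT;
        }
    } else {
        for (size_t k = 0; k < t->count; k++) {
            visited++;
            headers[t->index[k]] &= ~EXT_PRIM_BIT;
        }
    }
    return visited;
}
```

In this model, when the table does not overflow, the flush step visits only the remembered entries instead of every header.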
At startup time we reduce the CPU time by 8% based on the changes to initializeObjectMemory:, and remove another 4% based on a no-longer required rescan by flushExternalPrimitives.
Although these changes reduce PowerPC startup time by just a little, they should make a noticeable difference for 100MB images, or on machines that are *much* slower.
Other changes go into the Mac source code to reorganize how the menu bar is manipulated and to not start the 1/60 clock pthread until it's really required. That shaves a bunch of clock time off, and now startup is much snappier.
-- ======================================================================== === John M. McIntosh johnmci@smalltalkconsulting.com 1-800-477-2659 Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com ======================================================================== ===
> Now one of the 'ugly' things I did here to avoid altering the Interpreter class definition and having 16K of memory sucked up by a global one-time use table is to allocate a local 4096 element Array in ObjectMemory>>initializeObjectMemory: which gets passed to an altered adjustAllOopsBy: which remembers the methods oops that have refer to external primitives as it's looking for Oops with the Mark Bit set.
Yuck! Why don't you grab it from the end of the object memory itself? E.g., simply adjusting endOfMemory to just after the end of the OM will give you _lots_ of space for recording the CMs. Once the scan is complete you set it back and off you go. No need for nasty hacks in C methods.
Cheers, - Andreas
On Wednesday, August 6, 2003, at 01:26 PM, Andreas Raab wrote:
Yuck! Why don't you grab it from the end of the object memory itself? E.g., simply adjusting endOfMemory to just after the end of the OM will give you _lots_ of space for recording the CMs
Ah, hadn't considered that, I'll redo the change set.
We've got at least 100,000 bytes there (usually MBs more in reality) and I only need 16k or so.
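A rough sketch of the idea being discussed, borrowing the unused words past the end of the object memory as a scratch table: the Heap struct, its field names and the scratch_* helpers are hypothetical and are not the VM's actual data structures. With 4-byte entries, 4096 slots is 16,384 bytes, so 100,000 free bytes would hold about 25,000 entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the heap: a word array, the index just past the
   last object, and the index just past the last usable word. */
typedef struct {
    uint32_t *base;
    size_t end_of_memory;
    size_t memory_limit;
} Heap;

/* Number of free words available past the end of the object memory. */
static size_t scratch_capacity(const Heap *h) {
    return h->memory_limit - h->end_of_memory;
}

/* Record one oop in the free area; returns 0 when the area is full. */
static int scratch_record(Heap *h, size_t *count, uint32_t oop) {
    if (*count >= scratch_capacity(h)) {
        return 0;
    }
    h->base[h->end_of_memory + *count] = oop;
    *count += 1;
    return 1;
}

/* Read back the k-th recorded oop. */
static uint32_t scratch_get(const Heap *h, size_t k) {
    return h->base[h->end_of_memory + k];
}
```

Nothing below end_of_memory is touched in this model, so once the scan is finished the scratch words can simply be treated as free space again.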
Hi John,
You know what ... let's fix this problem for real. I've spent a bit of time looking through what happens and I'm certain now that the root bit problem only affects the active and the home context. If it were any different we would have been crashing all the time as the original problem showed quite nicely (it was _very_ reliable once you triggered a store into activeCtx ;-)
I've just been doing a little cleanup anyway (there is so much code duplication between primitiveSnapshot and primitiveSnapshotEmbedded) so I'll just add the external prim flushing to it and clean out the root bit. The root bit in the active context can act as the trigger for doing the full cleanup otherwise we rely on a cleanly saved image. BTW, I have checked it and the root bit of the active context _is_ set in all Squeak images back to 1.1 ;-) so we won't have any nasty surprises.
So do we have any other cleanup we'd like to do upon image save? This is a good time to mention it.
Cheers, - Andreas
-----Original Message----- From: squeak-dev-bounces@lists.squeakfoundation.org [mailto:squeak-dev-bounces@lists.squeakfoundation.org] On Behalf Of John M McIntosh Sent: Wednesday, August 06, 2003 11:08 PM To: The general-purpose Squeak developers list Subject: Re: [ENH][VM] faster image startup
On Wednesday, August 6, 2003, at 01:26 PM, Andreas Raab wrote:
Yuck! Why don't you grab it from the end of the object memory itself? E.g., simply adjusting endOfMemory to just after the end of the OM will give you _lots_ of space for recording the CMs
Ah, hadn't considered that, I'll redo the change set.
We got at least 100,000 bytes there (usually mb more in reality) and I only need 16k or so.
Andreas had submitted a change set, which went into 3.7.x, that was a revision of my faster image startup logic.
At OOPSLA I noted that totalObjectCount gets set to NULL/0 if we do not need to adjust the oops' starting location. I had thought this was a problem. In testing this morning, though, I realized the issue is difficult to pin down. This number is used to recalculate an end-of-memory location that reserves space for forwarding blocks, but *most* VMs use a floating allocation for the end of memory, versus reserved memory (say 1GB for Unix). Since the GC logic attempts to keep N MB free for youngSpace, about 500,000 entries are free anyway, and of course on the first full GC totalObjectCount gets reset.
However, to ensure behavior is the same as in the past, I suggest we just return a constant 300,000, which roughly matches the number of objects found in the 3.7 image. I think that *is* needed for VMs that have a constant memory size for image allocation.
Perhaps a bit more study is needed. I'm not quite sure about the relationships here between endOfMemory, memoryLimit, the attempt to reserve space for forwarding tables, and the setup of the free space size based on the targeted value.
On Aug 6, 2003, at 2:21 PM, Andreas Raab wrote:
Hi John,
You know what ... let's fix this problem for real. I've spent a bit of time looking through what happens and I'm certain now that the root bit problem only affects the active and the home context. If it were any different we would have been crashing all the time as the original problem showed quite nicely (it was _very_ reliable once you triggered a store into activeCtx ;-)
I've just been doing a little cleanup anyway (there is so much code duplication between primitiveSnapshot and primitiveSnapshotEmbedded) so I'll just add the external prim flushing to it and clean out the root bit. The root bit in the active context can act as the trigger for doing the full cleanup otherwise we rely on a cleanly saved image. BTW, I have checked it and the root bit of the active context _is_ set in all Squeak images back to 1.1 ;-) so we won't have any nasty surprises.
So do we have any other cleanup we'd like to do upon image save? This is a good time to mention it.
Cheers,
- Andreas
Incorporated into candidate stream for 3.7a VMMaker
tim -- Tim Rowledge, tim@sumeru.stanford.edu, http://sumeru.stanford.edu/tim Strange OpCodes: PSP: Push Stack Pointer