On Mon, Jan 13, 2014 at 09:26:31AM -0800, Eliot Miranda wrote:
Hi David,
On Sun, Jan 12, 2014 at 5:13 PM, David T. Lewis lewis@mail.msen.com wrote:
On Sun, Jan 12, 2014 at 09:53:27PM +0100, Bert Freudenberg wrote:
On 12.01.2014, at 20:42, David T. Lewis lewis@mail.msen.com wrote:
It is worth noting that allObjectsDo: relies on assumptions about how the object memory works internally. It requires that #someObject will always answer the object at the lowest address in the object memory, and also that a newly allocated object will always be placed at a higher address than all other objects. Either of these assumptions is likely to become a problem as new and better object memories and garbage collectors are implemented.
Dave
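The two assumptions described above can be made concrete with a toy model. This is a Python sketch, not VM or image code: `ToyHeap`, `Obj`, and `all_objects_do` are hypothetical names, and the "heap" is just a list ordered by address. The sentinel walk only terminates correctly because `some_object` answers the lowest object and `allocate` always places new objects above everything else.

```python
class Obj:
    """Stand-in for a heap object; only identity and a name matter here."""

    def __init__(self, name):
        self.name = name


class ToyHeap:
    """Toy object memory: a list ordered by address, lowest first."""

    def __init__(self, objects):
        self.slots = list(objects)

    def some_object(self):
        # Assumption 1: answers the object at the lowest address.
        return self.slots[0]

    def next_object(self, obj):
        # Answers the object at the next higher address, or None at the end.
        for i, candidate in enumerate(self.slots):
            if candidate is obj:
                return self.slots[i + 1] if i + 1 < len(self.slots) else None
        raise ValueError("object is not in this heap")

    def allocate(self, name):
        # Assumption 2: a new object lands above every existing object.
        obj = Obj(name)
        self.slots.append(obj)
        return obj


def all_objects_do(heap, block):
    """Walk the heap until a freshly allocated sentinel is reached."""
    obj = heap.some_object()
    sentinel = heap.allocate("sentinel")
    while obj is not sentinel:
        block(obj)
        obj = heap.next_object(obj)


heap = ToyHeap([Obj("a"), Obj("b"), Obj("c")])
visited = []


def visit(obj):
    visited.append(obj.name)
    heap.allocate("temp")  # an allocation made while the walk is running


all_objects_do(heap, visit)
```

In this model the temporaries end up above the sentinel, so the walk visits only a, b, and c. If `allocate` were free to reuse a lower address, or if `some_object` started somewhere other than the bottom, the same loop could skip objects or visit the new ones.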
Right (as Eliot's vm-dev post shows).
So IMHO the only sensible semantics of allObjectsDo: is as in "allObjects do:" - which might be implemented as a primitive in some VMs soonish. It *should not* enumerate objects created after calling the method.
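The "allObjects do:" semantics can be sketched the same way. This is a Python toy, assuming the heap is a plain list; `all_objects` and `snapshot_all_objects_do` are hypothetical names. The point is only that the collection is taken once, before the block runs, so anything the block creates is not enumerated.

```python
def all_objects(heap):
    """Answer a snapshot of the objects that exist right now."""
    return list(heap)


def snapshot_all_objects_do(heap, block):
    """Evaluate block for each object in the snapshot, never for later ones."""
    for obj in all_objects(heap):
        block(obj)


heap = ["a", "b"]
seen = []


def visit(obj):
    seen.append(obj)
    heap.append(obj + "-copy")  # created during enumeration, must not be visited


snapshot_all_objects_do(heap, visit)
```

Nothing here depends on where new objects are placed, which is what makes the snapshot approach independent of the object memory's layout.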
On Sun, Jan 12, 2014 at 12:01:00PM -0800, Eliot Miranda wrote:
The bug is in implementing allObjects in terms of someObject and nextObject in the first place. It's cheap and cheerful but horribly error-prone and restrictive. It's cheap because the collection of objects doesn't have to be created, and on the original 16-bit Smalltalk machines that was really important. It's horribly restrictive because it assumes much about the implementation.
Before closures a sentinel wasn't even needed because enumerating the block didn't create a new object (the block context was reused). So the code had to be rewritten just to support closures.
Spur has a generation scavenger operating in a distinct new space and that doesn't jibe well with a consistent ordering at all. So far the system is limping along by tenuring all objects on someObject and someInstance (so that newSpace is either empty, or doesn't contain any instances of a specific class) and having nextObject enumerate only objects in oldSpace.
But I think now we can afford a primitive that answers all the objects (remember that average object size means that such a collection will be ~10% of the heap; average object size in Squeak V3 is about 10.6 words). At least that's what Spur will do, along with an allInstancesOf: primitive. And then the become example won't cause any problems at all. Far more reliable. I suppose there are circumstances when enumerating without a container is the only feasible approach, but VisualWorks has got along with only an allObjects primitive for a long time now. I suspect we can too.
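The ~10% estimate follows directly from the average object size. A rough check in Python, assuming one word per slot in the result array and ignoring the array's own header (both are simplifying assumptions, not figures from the thread):

```python
AVERAGE_OBJECT_SIZE_WORDS = 10.6  # figure quoted above for Squeak V3
WORDS_PER_REFERENCE = 1           # assumption: one word per array slot


def snapshot_fraction_of_heap(avg_size_words=AVERAGE_OBJECT_SIZE_WORDS,
                              words_per_ref=WORDS_PER_REFERENCE):
    """Fraction of the heap needed to hold one reference per object."""
    return words_per_ref / avg_size_words


fraction = snapshot_fraction_of_heap()
print(f"{fraction:.1%}")  # prints 9.4%
```

So an array holding every object costs a little under a tenth of the heap it describes.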
Implementation attached. Works on interpreter VM, not yet tested on Cog but it should be ok there also. If no objections or better suggestions I will commit it to VMMaker tomorrow.
InterpreterPrimitives>>primitiveAllObjects
	"Answer an array of all objects that exist when the primitive is called, excluding those that may be garbage collected as a side effect of allocating the result array. Multiple references to nil in the last slots of the array are an indication that garbage collection occurred, such that some of the unreferenced objects that existed at the time of calling the primitive no longer exist. Sender is responsible for handling multiple references to nil in the result array."
Instead of filling the unused slots with nil or 0, I think you should shorten the object so that it contains each object only once, and contains only the objects. Cog contains some code for shortening. See [New]ObjectMemory>>shorten:toIndexableSize:.
That would be a better solution. However, I cannot offer an implementation in the near term because of:
shorten: obj toIndexableSize: nSlots
	"Currently this works for pointer objects only, and is almost certainly wrong for 64 bits."
Given that this is currently intended for pointer objects, it is probably fairly straightforward to get it working on the 64-bit object memory. In fact, it might already work as written. But I think that it will take some time to test so it's not going to happen tonight.
We could consider a variation on Bert's suggestion, in which the result array might have trailing zeros if garbage collection has occurred. Later the primitive can be improved with shorten:toIndexableSize: after which the trailing zeros will never occur in practice. That would still put the burden on the image to ignore the trailing junk, so I don't know if it would be worth doing.
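What "ignoring the trailing junk" would mean on the image side can be sketched as follows. This is a Python toy with hypothetical names (`is_padding`, `trim_trailing_padding`); it assumes the padding is the integer 0, which in Squeak is an immediate SmallInteger rather than a heap object and so should not otherwise appear in the result. Python's `None` stands in for nil, which is a real object and must be kept.

```python
def is_padding(slot):
    """True only for the integer 0, the assumed padding value."""
    return type(slot) is int and slot == 0


def trim_trailing_padding(result):
    """Answer a copy of result without its run of trailing padding slots."""
    end = len(result)
    while end > 0 and is_padding(result[end - 1]):
        end -= 1
    return result[:end]


raw = ["anObject", "anotherObject", None, 0, 0]
objects = trim_trailing_padding(raw)
```

With nil as the padding value the sender's job is slightly harder, since one reference to nil is legitimate and only the extra trailing ones are junk. Shortening the array inside the primitive would remove the need for either workaround.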
Dave
On Mon, Jan 13, 2014 at 3:59 PM, David T. Lewis lewis@mail.msen.com wrote:
There's no point hurrying a thing like this. Best to do it right. Take your time and answer an Array of the objects and nothing but the objects :-). We're essentially aiming at 4.6 now so there's lots of time to test.
On 13-01-2014, at 4:07 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
There's no point hurrying a thing like this. Best to do it right. Take your time and answer an Array of the objects and nothing but the objects :-). We're essentially aiming at 4.6 now so there's lots of time to test.
I think this is probably true of the actual operation as well. Making an array of all objects in the system is a big job (uh, might it need a flag or two to specify whether not-in-memory & proxy objects are to be realised?). Maybe a full gc is in order to make sure things are clean. Maybe the gc would provide a count of objects left live so that we could make the answer array that big without a further memory scan.
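The idea of letting the gc report a live count can be sketched as a toy. This is Python, not VM code: `full_gc` and `all_objects_after_gc` are hypothetical names, the heap is a list, and references are a dict from object to the objects it points at. A real VM would also have to account for the result array itself being a new object, which this sketch ignores.

```python
def full_gc(heap, roots, references):
    """Toy full GC: drop unreachable objects in place, answer the live count."""
    live = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(references.get(obj, ()))
    heap[:] = [obj for obj in heap if obj in live]  # survivors keep their order
    return len(heap)


def all_objects_after_gc(heap, roots, references):
    """Size the result from the GC's count, then fill it in a single walk."""
    live_count = full_gc(heap, roots, references)
    result = [None] * live_count  # exactly sized, no separate counting pass
    for i in range(live_count):
        result[i] = heap[i]
    return result


heap = ["a", "b", "c", "d"]
references = {"a": ["c"], "c": ["a"]}
result = all_objects_after_gc(heap, ["a"], references)
```

Because the array is sized from the count the collector already had to compute, there are no unused slots to pad or shorten afterwards, at least in this simplified model.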
tim
--
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim
Strange OpCodes: EF: Emulate Fireworks
vm-dev@lists.squeakfoundation.org