Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2 ] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary), while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
Hi Igor -
Could you give us a brief explanation about what you did? The bug report doesn't say much and the benchmark below says even less :-)
Cheers, - Andreas
On 3/8/2010 1:31 PM, Igor Stasenko wrote:
Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2
] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary), while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
On 9 March 2010 00:32, Andreas Raab andreas.raab@gmx.de wrote:
Hi Igor -
Could you give us a brief explanation about what you did? The bug report doesn't say much and the benchmark below says even less :-)
Oh.. sorry for being brief :)
There is a little change, introduced in ObjectMemory>>finalizeReference: oop method , which:
a) once it detects an oop who having a weak references which just died, and replaced by nil b) given oop having at least 2 instance variables (non-weak pointers) c) a first instance variable points to an object which is an instance of class , registered as a special object (WeakFinalizer)
if all of these conditions met, then it simply does following:
list := oop instVarAt: 1. list class == WeakFinalizer ifTrue: [ first := list instVarAt: 1. oop instVarAt: 2 put: first. list instVarAt: 1 put: oop ]
so, in case if you have a multiple such objects, which holding a weak refs, and using the same object in 'list', and then if some (or all of them will have their weak refs gone), then these objects will be linked to that list:
head := list head. list head: object. object next: head.
This means, that list will contain only those objects, which losed a weak refs during last GC, and so, you don't have to scan all items in weak container to determine which ones if gone due to GC (like currently WeakRegistry doing).
Cheers, - Andreas
On 3/8/2010 1:31 PM, Igor Stasenko wrote:
Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2 ] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary), while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
Oh, and this actually an implementation of idea, we discussed earlier
http://lists.squeakfoundation.org/pipermail/vm-dev/2009-April/002572.html
While its not an ephemerons, its much more easier to adopt because there is no need in making heavy changes to GC.
Thanks Igor - if you have been implementing what you've described in this email I'm all for it. Give me a bit of time to review the code since this stuff can be tricky.
One thing I do recall and that you may want to check is that testing class membership can be tricky during GC - in fact, the introduction of a weak format was due to issues with determining the class of some object correctly. Unfortunately, I do not recall the exact details but it's worthwhile going through this very carefully and make sure that one can actually rely on a) the class pointer being valid (and not substituted while traversing the object) and b) the pointer in splObjs be valid (and not substituted).
I'll give this special attention while looking through the code.
Cheers, - Andreas
On 3/8/2010 3:14 PM, Igor Stasenko wrote:
Oh, and this actually an implementation of idea, we discussed earlier
http://lists.squeakfoundation.org/pipermail/vm-dev/2009-April/002572.html
While its not an ephemerons, its much more easier to adopt because there is no need in making heavy changes to GC.
On 9 March 2010 02:06, Andreas Raab andreas.raab@gmx.de wrote:
Thanks Igor - if you have been implementing what you've described in this email I'm all for it. Give me a bit of time to review the code since this stuff can be tricky.
One thing I do recall and that you may want to check is that testing class membership can be tricky during GC - in fact, the introduction of a weak format was due to issues with determining the class of some object correctly. Unfortunately, I do not recall the exact details but it's worthwhile going through this very carefully and make sure that one can actually rely on a) the class pointer being valid (and not substituted while traversing the object) and b) the pointer in splObjs be valid (and not substituted).
I'll give this special attention while looking through the code.
Yes, this is a biggest of my concern.
Also maybe i should use a #longAt:put: , or uncheckedXXX put: (forgot the selector) instead of self storePointer: 1 ofObject: oop withValue:
because storePointer using the possibleRootStoreIntoValue(), which could have unwanted effects :)
Maybe it could help you with analyzis: - the code touching a strong references , where all of them is already marked as reachable (because oop is reachable, and hence the list ivar, and hence the list's 'first' ivar). So, its only rearranging them a bit, but doesn't making none of them unreachable, or even worse - making an already non-reachable object be reachable again.
- the weak finalization check runs during sweep phase before compaction. Which means, that objects is not changing their locations during that phase.
Cheers, - Andreas
On 3/8/2010 3:14 PM, Igor Stasenko wrote:
Oh, and this actually an implementation of idea, we discussed earlier
http://lists.squeakfoundation.org/pipermail/vm-dev/2009-April/002572.html
While its not an ephemerons, its much more easier to adopt because there is no need in making heavy changes to GC.
On 9 March 2010 02:45, Igor Stasenko siguctua@gmail.com wrote:
On 9 March 2010 02:06, Andreas Raab andreas.raab@gmx.de wrote:
Thanks Igor - if you have been implementing what you've described in this email I'm all for it. Give me a bit of time to review the code since this stuff can be tricky.
One thing I do recall and that you may want to check is that testing class membership can be tricky during GC - in fact, the introduction of a weak format was due to issues with determining the class of some object correctly. Unfortunately, I do not recall the exact details but it's worthwhile going through this very carefully and make sure that one can actually rely on a) the class pointer being valid (and not substituted while traversing the object) and b) the pointer in splObjs be valid (and not substituted).
I'll give this special attention while looking through the code.
Yes, this is a biggest of my concern.
Also maybe i should use a #longAt:put: , or uncheckedXXX put: (forgot the selector) instead of self storePointer: 1 ofObject: oop withValue:
because storePointer using the possibleRootStoreIntoValue(), which could have unwanted effects :)
Maybe it could help you with analyzis: - the code touching a strong references , where all of them is already marked as reachable (because oop is reachable, and hence the list ivar, and hence the list's 'first' ivar). So, its only rearranging them a bit, but doesn't making none of them unreachable, or even worse - making an already non-reachable object be reachable again.
- the weak finalization check runs during sweep phase before
compaction. Which means, that objects is not changing their locations during that phase.
Few more word: i assuming that accessing special objects oop is safe, because of given line of code: self longAt: oop + i put: nilObj.
nilObj is a special object, extracted from special objects array for efficiency. And while it is safe to use it in that way (by replacing a weak ref by nil), then i think it is safe to use any other object from special objects array. But maybe i wrong :)
Cheers, - Andreas
On 3/8/2010 3:14 PM, Igor Stasenko wrote:
Oh, and this actually an implementation of idea, we discussed earlier
http://lists.squeakfoundation.org/pipermail/vm-dev/2009-April/002572.html
While its not an ephemerons, its much more easier to adopt because there is no need in making heavy changes to GC.
-- Best regards, Igor Stasenko AKA sig.
On 3/8/2010 4:45 PM, Igor Stasenko wrote:
On 9 March 2010 02:06, Andreas Raabandreas.raab@gmx.de wrote:
Thanks Igor - if you have been implementing what you've described in this email I'm all for it. Give me a bit of time to review the code since this stuff can be tricky.
One thing I do recall and that you may want to check is that testing class membership can be tricky during GC - in fact, the introduction of a weak format was due to issues with determining the class of some object correctly. Unfortunately, I do not recall the exact details but it's worthwhile going through this very carefully and make sure that one can actually rely on a) the class pointer being valid (and not substituted while traversing the object) and b) the pointer in splObjs be valid (and not substituted).
I'll give this special attention while looking through the code.
Yes, this is a biggest of my concern.
You know what, I've been looking at it and it looks safe to me. I also remembered why it was originally unsafe - it wasn't sweep phase that's the problem it is mark phase, where the headers get replaced you can't do a proper class membership test (but you need to avoid tracing weak references). During sweep, where you hook in, there is no such problem.
Also maybe i should use a #longAt:put: , or uncheckedXXX put: (forgot the selector) instead of self storePointer: 1 ofObject: oop withValue:
because storePointer using the possibleRootStoreIntoValue(), which could have unwanted effects :)
I think using storePointer:ofObject:withValue is the right thing to do. The roots table will be cleared if this happens during fullGC, but if this happens during IGC, and either list or finalizer are old, then they must be properly recorded as root.
Maybe it could help you with analyzis:
- the code touching a strong references , where all of them is
already marked as reachable (because oop is reachable, and hence the list ivar, and hence the list's 'first' ivar). So, its only rearranging them a bit, but doesn't making none of them unreachable, or even worse - making an already non-reachable object be reachable again.
Yes, that sounds right.
- the weak finalization check runs during sweep phase before
compaction. Which means, that objects is not changing their locations during that phase.
It sounds good to me. I *really* like how small the changes are - it makes it much easier to verify correctness. I think what we should be doing is build VMs to test it and run it for a while (which is easy enough between us).
BTW, one tiny thing we'll have to fix is to add a test for splObj array - the way the code is written right now it could run afoul of an old image that doesn't have anything registered as ClassWeakFinalizer.
But other than that, great job!
Cheers, - Andreas
On 9 March 2010 10:41, Andreas Raab andreas.raab@gmx.de wrote:
On 3/8/2010 4:45 PM, Igor Stasenko wrote:
On 9 March 2010 02:06, Andreas Raabandreas.raab@gmx.de wrote:
Thanks Igor - if you have been implementing what you've described in this email I'm all for it. Give me a bit of time to review the code since this stuff can be tricky.
One thing I do recall and that you may want to check is that testing class membership can be tricky during GC - in fact, the introduction of a weak format was due to issues with determining the class of some object correctly. Unfortunately, I do not recall the exact details but it's worthwhile going through this very carefully and make sure that one can actually rely on a) the class pointer being valid (and not substituted while traversing the object) and b) the pointer in splObjs be valid (and not substituted).
I'll give this special attention while looking through the code.
Yes, this is a biggest of my concern.
You know what, I've been looking at it and it looks safe to me. I also remembered why it was originally unsafe - it wasn't sweep phase that's the problem it is mark phase, where the headers get replaced you can't do a proper class membership test (but you need to avoid tracing weak references). During sweep, where you hook in, there is no such problem.
Also maybe i should use a #longAt:put: , or uncheckedXXX put: (forgot the selector) instead of self storePointer: 1 ofObject: oop withValue:
because storePointer using the possibleRootStoreIntoValue(), which could have unwanted effects :)
I think using storePointer:ofObject:withValue is the right thing to do. The roots table will be cleared if this happens during fullGC, but if this happens during IGC, and either list or finalizer are old, then they must be properly recorded as root.
Maybe it could help you with analyzis: - the code touching a strong references , where all of them is already marked as reachable (because oop is reachable, and hence the list ivar, and hence the list's 'first' ivar). So, its only rearranging them a bit, but doesn't making none of them unreachable, or even worse - making an already non-reachable object be reachable again.
Yes, that sounds right.
- the weak finalization check runs during sweep phase before
compaction. Which means, that objects is not changing their locations during that phase.
It sounds good to me. I *really* like how small the changes are - it makes it much easier to verify correctness. I think what we should be doing is build VMs to test it and run it for a while (which is easy enough between us).
BTW, one tiny thing we'll have to fix is to add a test for splObj array - the way the code is written right now it could run afoul of an old image that doesn't have anything registered as ClassWeakFinalizer.
yes, in old images it will read a value beyond the splObj oop, so, its maybe worth adding back the splObj size check before reading this value.
But then it is compared with (self fetchClassOf: listOop) == (self splObj: ClassWeakFinalizer)
so, what chances that both these values , accidentially will be same , when we reading at wrong slot of special objects array? In most cases, a word, which you reading past the object contents is a header of the following oop which can be either a free chunk, or valid object header. So, i think they are very small.
But other than that, great job!
Cheers, - Andreas
On Mon, 8 Mar 2010, Igor Stasenko wrote:
Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2 ] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary),
I took a quick look and found that you omitted the semaphore from WeakFinalizationRegistry. I think it won't work this way.
while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
That's great.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I found it strange too, but I think the users of WeakRegistry don't implement their own #hash and #=, though I didn't check that. Using a WeakIdentityKeyDictionary could also mean better performance in most cases.
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
I think copy is supported because WeakRegistry is a collection. I wonder how could you get the unpredictable behavior with the current implementation.
Levente
-- Best regards, Igor Stasenko AKA sig.
On 9 March 2010 00:44, Levente Uzonyi leves@elte.hu wrote:
On Mon, 8 Mar 2010, Igor Stasenko wrote:
Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2 ] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary),
I took a quick look and found that you omitted the semaphore from WeakFinalizationRegistry. I think it won't work this way.
The point is, that new implementation don't touching the 'valueDictionary' during finalization, and therefore from this side, there is no concurrency issues.
Adding/accessing items in the list is not protected. But i think the main reason why we having a semaphore here is to protect from interrupts caused by garbage collector and finalization process. If you think we should also protect it from concurrent access , when populating it - i could add the semaphores.
while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
That's great.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I found it strange too, but I think the users of WeakRegistry don't implement their own #hash and #=, though I didn't check that. Using a WeakIdentityKeyDictionary could also mean better performance in most cases.
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
I think copy is supported because WeakRegistry is a collection. I wonder how could you get the unpredictable behavior with the current implementation.
well, potentially, it could lead to making finalization twice (by each registry object), since when you copying registry you doubling the number of references to all executor objects which is held strongly by registry.
Levente
-- Best regards, Igor Stasenko AKA sig.
On Tue, 9 Mar 2010, Igor Stasenko wrote:
On 9 March 2010 00:44, Levente Uzonyi leves@elte.hu wrote:
On Mon, 8 Mar 2010, Igor Stasenko wrote:
Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2 ] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary),
I took a quick look and found that you omitted the semaphore from WeakFinalizationRegistry. I think it won't work this way.
The point is, that new implementation don't touching the 'valueDictionary' during finalization, and therefore from this side, there is no concurrency issues.
Adding/accessing items in the list is not protected. But i think the main reason why we having a semaphore here is to protect from interrupts caused by garbage collector and finalization process. If you think we should also protect it from concurrent access , when populating it - i could add the semaphores.
I think it's a must. For example you can always open two sockets or two files concurrently. Or close one while opening another, etc.
while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
That's great.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I found it strange too, but I think the users of WeakRegistry don't implement their own #hash and #=, though I didn't check that. Using a WeakIdentityKeyDictionary could also mean better performance in most cases.
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
I think copy is supported because WeakRegistry is a collection. I wonder how could you get the unpredictable behavior with the current implementation.
well, potentially, it could lead to making finalization twice (by each registry object), since when you copying registry you doubling the number of references to all executor objects which is held strongly by registry.
I think it's the responsibility of the user of #copy to avoid that. And the user doesn't have to do much, because the copy is not registered in WeakArray for finalization, so sending #copy to WeakRegistry won't cause any harm IMHO.
Levente
Levente
-- Best regards, Igor Stasenko AKA sig.
-- Best regards, Igor Stasenko AKA sig.
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
On 9 March 2010 00:44, Levente Uzonyi leves@elte.hu wrote:
On Mon, 8 Mar 2010, Igor Stasenko wrote:
Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2 ] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary),
I took a quick look and found that you omitted the semaphore from WeakFinalizationRegistry. I think it won't work this way.
The point is, that new implementation don't touching the 'valueDictionary' during finalization, and therefore from this side, there is no concurrency issues.
Adding/accessing items in the list is not protected. But i think the main reason why we having a semaphore here is to protect from interrupts caused by garbage collector and finalization process. If you think we should also protect it from concurrent access , when populating it - i could add the semaphores.
I think it's a must. For example you can always open two sockets or two files concurrently. Or close one while opening another, etc.
No problem, it could be added quite easily.
while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
That's great.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I found it strange too, but I think the users of WeakRegistry don't implement their own #hash and #=, though I didn't check that. Using a WeakIdentityKeyDictionary could also mean better performance in most cases.
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
I think copy is supported because WeakRegistry is a collection. I wonder how could you get the unpredictable behavior with the current implementation.
well, potentially, it could lead to making finalization twice (by each registry object), since when you copying registry you doubling the number of references to all executor objects which is held strongly by registry.
I think it's the responsibility of the user of #copy to avoid that. And the user doesn't have to do much, because the copy is not registered in WeakArray for finalization, so sending #copy to WeakRegistry won't cause any harm IMHO.
why allowing user to go into absurdly direction? Why just #shouldNotImplement it? Do you see any good and practical uses of weakregistry copies?
Levente
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
On 9 March 2010 00:44, Levente Uzonyi leves@elte.hu wrote:
On Mon, 8 Mar 2010, Igor Stasenko wrote:
Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2 ] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary),
I took a quick look and found that you omitted the semaphore from WeakFinalizationRegistry. I think it won't work this way.
The point is, that new implementation don't touching the 'valueDictionary' during finalization, and therefore from this side, there is no concurrency issues.
Adding/accessing items in the list is not protected. But i think the main reason why we having a semaphore here is to protect from interrupts caused by garbage collector and finalization process. If you think we should also protect it from concurrent access , when populating it - i could add the semaphores.
I think it's a must. For example you can always open two sockets or two files concurrently. Or close one while opening another, etc.
No problem, it could be added quite easily.
while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
That's great.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I found it strange too, but I think the users of WeakRegistry don't implement their own #hash and #=, though I didn't check that. Using a WeakIdentityKeyDictionary could also mean better performance in most cases.
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
I think copy is supported because WeakRegistry is a collection. I wonder how could you get the unpredictable behavior with the current implementation.
well, potentially, it could lead to making finalization twice (by each registry object), since when you copying registry you doubling the number of references to all executor objects which is held strongly by registry.
I think it's the responsibility of the user of #copy to avoid that. And the user doesn't have to do much, because the copy is not registered in WeakArray for finalization, so sending #copy to WeakRegistry won't cause any harm IMHO.
why allowing user to go into absurdly direction? Why just #shouldNotImplement it? Do you see any good and practical uses of weakregistry copies?
I don't see any use of it, but someone else may. We allow everyone to evaluate Object superclass: Object, even though we know that it will crash the system.
Levente
Levente
-- Best regards, Igor Stasenko AKA sig.
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
On 9 March 2010 00:44, Levente Uzonyi leves@elte.hu wrote:
On Mon, 8 Mar 2010, Igor Stasenko wrote:
Please, review the http://bugs.squeak.org/view.php?id=7473
There are two sets of changesets - one for vmmaker and other is for language-side.
A VMMaker changeset is based on VMMaker-dtl.159.
Here the little benchmark, between old and new weak registries:
{ WeakRegistry. WeakFinalizationRegistry } collect: [:class | | registry weaklings time1 time2 | registry := class new. WeakArray removeWeakDependent: registry.
weaklings := (1 to: 100000) collect: [:i | Object new ]. time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. weaklings at: 100 put: nil. Smalltalk garbageCollect; garbageCollect. time2 := [ registry finalizeValues ] timeToRun. time1 @ time2 ] {7816@41 . 4114@0}
While its not much better at first benchmark (since using the same approach to store objects in one dictionary),
I took a quick look and found that you omitted the semaphore from WeakFinalizationRegistry. I think it won't work this way.
The point is, that new implementation don't touching the 'valueDictionary' during finalization, and therefore from this side, there is no concurrency issues.
Adding/accessing items in the list is not protected. But i think the main reason why we having a semaphore here is to protect from interrupts caused by garbage collector and finalization process. If you think we should also protect it from concurrent access , when populating it - i could add the semaphores.
I think it's a must. For example you can always open two sockets or two files concurrently. Or close one while opening another, etc.
No problem, it could be added quite easily.
while other is significant, since there is no longer need to scan a whole collection to detect an items which become a garbage.
That's great.
Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary instead of WeakIdentityKeyDictionary? Isn't a weak refs is identity-based?
I found it strange too, but I think the users of WeakRegistry don't implement their own #hash and #=, though I didn't check that. Using a WeakIdentityKeyDictionary could also mean better performance in most cases.
I am also a bit wonder, why WeakRegistry has to support a copy protocol? It may lead to unpredictable behavior once you try to copy such kind of container, no matter how well you protect it.
I think copy is supported because WeakRegistry is a collection. I wonder how could you get the unpredictable behavior with the current implementation.
well, potentially, it could lead to making finalization twice (by each registry object), since when you copying registry you doubling the number of references to all executor objects which is held strongly by registry.
I think it's the responsibility of the user of #copy to avoid that. And the user doesn't have to do much, because the copy is not registered in WeakArray for finalization, so sending #copy to WeakRegistry won't cause any harm IMHO.
why allowing user to go into absurdly direction? Why just #shouldNotImplement it? Do you see any good and practical uses of weakregistry copies?
I don't see any use of it, but someone else may. We allow everyone to evaluate Object superclass: Object, even though we know that it will crash the system.
Well, if someone cares, then he actually can make own registry class, which allows copying. But why we should care, by leaving a potential security hole open? I don't think that this is normal to rely on good manners of users and expect them to not attempt to do something wrong. In contrast, a protocols and interfaces should discourage user from abuse. At least, an author could state where it is safe to use and where is not. A copy protocol for weak registry is far from being safe. In that situation is better to generate an error, indicating that such use is not foreseen by author, rather than trying to implement something which 'possibly could work' :)
Levente
Levente
-- Best regards, Igor Stasenko AKA sig.
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
On 9 March 2010 00:44, Levente Uzonyi leves@elte.hu wrote:
On Mon, 8 Mar 2010, Igor Stasenko wrote:
> > Please, review the > http://bugs.squeak.org/view.php?id=7473 > > There are two sets of changesets - one for vmmaker and other is for > language-side. > > A VMMaker changeset is based on VMMaker-dtl.159. > > Here the little benchmark, between old and new weak registries: > > { WeakRegistry. WeakFinalizationRegistry } collect: [:class | > | registry weaklings time1 time2 | > registry := class new. > WeakArray removeWeakDependent: registry. > > weaklings := (1 to: 100000) collect: [:i | Object new ]. > time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. > weaklings at: 100 put: nil. > Smalltalk garbageCollect; garbageCollect. > time2 := [ registry finalizeValues ] timeToRun. > time1 @ time2 > ] > {7816@41 . 4114@0} > > While its not much better at first benchmark (since using the same > approach to store objects in one dictionary),
I took a quick look and found that you omitted the semaphore from WeakFinalizationRegistry. I think it won't work this way.
The point is, that new implementation don't touching the 'valueDictionary' during finalization, and therefore from this side, there is no concurrency issues.
Adding/accessing items in the list is not protected. But i think the main reason why we having a semaphore here is to protect from interrupts caused by garbage collector and finalization process. If you think we should also protect it from concurrent access , when populating it - i could add the semaphores.
I think it's a must. For example you can always open two sockets or two files concurrently. Or close one while opening another, etc.
No problem, it could be added quite easily.
> while other is significant, since there is no longer need to scan a > whole collection to detect an items which become a garbage.
That's great.
> > Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary > instead of WeakIdentityKeyDictionary? > Isn't a weak refs is identity-based? >
I found it strange too, but I think the users of WeakRegistry don't implement their own #hash and #=, though I didn't check that. Using a WeakIdentityKeyDictionary could also mean better performance in most cases.
> I am also a bit wonder, why WeakRegistry has to support a copy protocol? > It may lead to unpredictable behavior once you try to copy such kind > of container, no matter how well you protect it.
I think copy is supported because WeakRegistry is a collection. I wonder how could you get the unpredictable behavior with the current implementation.
well, potentially, it could lead to making finalization twice (by each registry object), since when you copying registry you doubling the number of references to all executor objects which is held strongly by registry.
I think it's the responsibility of the user of #copy to avoid that. And the user doesn't have to do much, because the copy is not registered in WeakArray for finalization, so sending #copy to WeakRegistry won't cause any harm IMHO.
why allowing user to go into absurdly direction? Why just #shouldNotImplement it? Do you see any good and practical uses of weakregistry copies?
I don't see any use of it, but someone else may. We allow everyone to evaluate Object superclass: Object, even though we know that it will crash the system.
Well, if someone cares, then he actually can make own registry class, which allows copying. But why we should care, by leaving a potential security hole open? I don't think that this is normal to rely on good manners of users and expect them to not attempt to do something wrong. In contrast, a protocols and interfaces should discourage user from abuse. At least, an author could state where it is safe to use and where is not. A copy protocol for weak registry is far from being safe. In that situation is better to generate an error, indicating that such use is not foreseen by author, rather than trying to implement something which 'possibly could work' :)
I still don't see how is it unsafe.
Levente
Levente
Levente
-- Best regards, Igor Stasenko AKA sig.
-- Best regards, Igor Stasenko AKA sig.
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
On 9 March 2010 00:44, Levente Uzonyi leves@elte.hu wrote: > > On Mon, 8 Mar 2010, Igor Stasenko wrote: > >> >> Please, review the >> http://bugs.squeak.org/view.php?id=7473 >> >> There are two sets of changesets - one for vmmaker and other is for >> language-side. >> >> A VMMaker changeset is based on VMMaker-dtl.159. >> >> Here the little benchmark, between old and new weak registries: >> >> { WeakRegistry. WeakFinalizationRegistry } collect: [:class | >> | registry weaklings time1 time2 | >> registry := class new. >> WeakArray removeWeakDependent: registry. >> >> weaklings := (1 to: 100000) collect: [:i | Object new ]. >> time1 := [ weaklings do: [:each | registry add: each ] ] timeToRun. >> weaklings at: 100 put: nil. >> Smalltalk garbageCollect; garbageCollect. >> time2 := [ registry finalizeValues ] timeToRun. >> time1 @ time2 >> ] >> {7816@41 . 4114@0} >> >> While its not much better at first benchmark (since using the same >> approach to store objects in one dictionary), > > I took a quick look and found that you omitted the semaphore from > WeakFinalizationRegistry. I think it won't work this way. >
The point is, that new implementation don't touching the 'valueDictionary' during finalization, and therefore from this side, there is no concurrency issues.
Adding/accessing items in the list is not protected. But i think the main reason why we having a semaphore here is to protect from interrupts caused by garbage collector and finalization process. If you think we should also protect it from concurrent access , when populating it - i could add the semaphores.
I think it's a must. For example you can always open two sockets or two files concurrently. Or close one while opening another, etc.
No problem, it could be added quite easily.
>> while other is significant, since there is no longer need to scan a >> whole collection to detect an items which become a garbage. > > That's great. > >> >> Btw, i wonder, why current WeakRegistry using a WeakKeyDictionary >> instead of WeakIdentityKeyDictionary? >> Isn't a weak refs is identity-based? >> > > I found it strange too, but I think the users of WeakRegistry don't > implement their own #hash and #=, though I didn't check that. Using a > WeakIdentityKeyDictionary could also mean better performance in most cases. > >> I am also a bit wonder, why WeakRegistry has to support a copy protocol? >> It may lead to unpredictable behavior once you try to copy such kind >> of container, no matter how well you protect it. > > I think copy is supported because WeakRegistry is a collection. I wonder how > could you get the unpredictable behavior with the current implementation. >
well, potentially, it could lead to making finalization twice (by each registry object), since when you copying registry you doubling the number of references to all executor objects which is held strongly by registry.
I think it's the responsibility of the user of #copy to avoid that. And the user doesn't have to do much, because the copy is not registered in WeakArray for finalization, so sending #copy to WeakRegistry won't cause any harm IMHO.
why allowing user to go into absurdly direction? Why just #shouldNotImplement it? Do you see any good and practical uses of weakregistry copies?
I don't see any use of it, but someone else may. We allow everyone to evaluate Object superclass: Object, even though we know that it will crash the system.
Well, if someone cares, then he actually can make own registry class, which allows copying. But why we should care, by leaving a potential security hole open? I don't think that this is normal to rely on good manners of users and expect them to not attempt to do something wrong. In contrast, a protocols and interfaces should discourage user from abuse. At least, an author could state where it is safe to use and where is not. A copy protocol for weak registry is far from being safe. In that situation is better to generate an error, indicating that such use is not foreseen by author, rather than trying to implement something which 'possibly could work' :)
I still don't see how is it unsafe.
A simple copy is fine. A copy, which then registered in weak dependents creating a problem.
I even think that weak registry should not behave as collection at all, and having only #add: method, with no ability to remove items.
The only use of #remove: i see is in Socket and StandardFileStream, which implement #unregister: in own class side.
Now , if you look at my implementation, you could see that there is a way to completely avoid the need in removing items from a valueDictionary which is pairs of <weakref> -> <executor>.
A solution: - add a 'finalizer' ivar to Socket/StandardFileStream - by registering a socket in registry, retrieve an instance of WeakFinalizerItem as a result of registration and store it into 'finalizer' ivar.
- on #destroy, simply nil-out all of the finalizer's ivars, so there is no chances that once socket become garbage, it will trigger an executor's #finalize method, which were registered previously in registry.
- forget about removing the finalizer manually from registry, because an instance of WeakFinalizerItem which is held by 'valueDictionary' in registry will eventually be reclaimed, once dictionary will discover that corresponding key is nil.
What you think?
Levente
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
Well, if someone cares, then he actually can make own registry class, which allows copying. But why we should care, by leaving a potential security hole open? I don't think that this is normal to rely on good manners of users and expect them to not attempt to do something wrong. In contrast, a protocols and interfaces should discourage user from abuse. At least, an author could state where it is safe to use and where is not. A copy protocol for weak registry is far from being safe. In that situation is better to generate an error, indicating that such use is not foreseen by author, rather than trying to implement something which 'possibly could work' :)
I still don't see how is it unsafe.
A simple copy is fine. A copy, which then registered in weak dependents creating a problem.
I even think that weak registry should not behave as collection at all, and having only #add: method, with no ability to remove items.
The only use of #remove: i see is in Socket and StandardFileStream, which implement #unregister: in own class side.
Now , if you look at my implementation, you could see that there is a way to completely avoid the need in removing items from a valueDictionary which is pairs of <weakref> -> <executor>.
A solution:
- add a 'finalizer' ivar to Socket/StandardFileStream
- by registering a socket in registry, retrieve an instance of
WeakFinalizerItem as a result of registration and store it into 'finalizer' ivar.
- on #destroy, simply nil-out all of the finalizer's ivars, so there
is no chances that once socket become garbage, it will trigger an executor's #finalize method, which were registered previously in registry.
- forget about removing the finalizer manually from registry, because
an instance of WeakFinalizerItem which is held by 'valueDictionary' in registry will eventually be reclaimed, once dictionary will discover that corresponding key is nil.
What you think?
I took a deeper look and found that WeakFinalizationRegistry doesn't support multiple finalizers per object. It does what WeakRegistry did before: throw away the previous finalizer and replace it with a new one.
I like the current features of WeakRegistry (removing, adding, multiple finalizers per object) and I think it's easy to modify it to use your vm support.
I don't really like your suggestion for files and sockets, but it's doable.
I was thinking about a simpler registry which would use an OrderedCollection instead of a WeakKeyDictionary. It'd need a new instance variable in WeakFinalizerItem (though it can be another class/subclass too) which would store the index of the finalizer in this OrderedCollection. It wouldn't have #do:, #keys or #remove:ifAbsent: (though all of them could be implemented) and #add: wouldn't replace existing finalizers, but just add them to the registry. This would have a bit better performance, simpler implementation and less features. But if you don't need #remove:, i'm sure it'd fit your needs. (It's a bit tricky though. Your socket/file ideas wouldn't work out of the box, because all of the WeakFinalizerItems would have to be removed from the OrderedCollecion somehow to avoid leaking memory. One solution would be to add it to the list on "remove" (when replacing the ivars with nil), another would be to remove it from the OrderedCollection by index, but the items don't know their collection so it would need another ivar).
All in all, I'd keep WeakRegistry with the current features and optionally add a new simpler and faster registry if needed.
Levente
Levente
-- Best regards, Igor Stasenko AKA sig.
On 10 March 2010 04:34, Levente Uzonyi leves@elte.hu wrote:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
Well, if someone cares, then he actually can make own registry class, which allows copying. But why we should care, by leaving a potential security hole open? I don't think that this is normal to rely on good manners of users and expect them to not attempt to do something wrong. In contrast, a protocols and interfaces should discourage user from abuse. At least, an author could state where it is safe to use and where is not. A copy protocol for weak registry is far from being safe. In that situation is better to generate an error, indicating that such use is not foreseen by author, rather than trying to implement something which 'possibly could work' :)
I still don't see how is it unsafe.
A simple copy is fine. A copy, which then registered in weak dependents creating a problem.
I even think that weak registry should not behave as collection at all, and having only #add: method, with no ability to remove items.
The only use of #remove: i see is in Socket and StandardFileStream, which implement #unregister: in own class side.
Now , if you look at my implementation, you could see that there is a way to completely avoid the need in removing items from a valueDictionary which is pairs of <weakref> -> <executor>.
A solution:
- add a 'finalizer' ivar to Socket/StandardFileStream
- by registering a socket in registry, retrieve an instance of
WeakFinalizerItem as a result of registration and store it into 'finalizer' ivar.
- on #destroy, simply nil-out all of the finalizer's ivars, so there
is no chances that once socket become garbage, it will trigger an executor's #finalize method, which were registered previously in registry.
- forget about removing the finalizer manually from registry, because
an instance of WeakFinalizerItem which is held by 'valueDictionary' in registry will eventually be reclaimed, once dictionary will discover that corresponding key is nil.
What you think?
I took a deeper look and found that WeakFinalizationRegistry doesn't support multiple finalizers per object. It does what WeakRegistry did before: throw away the previous finalizer and replace it with a new one.
I like the current features of WeakRegistry (removing, adding, multiple finalizers per object) and I think it's easy to modify it to use your vm support.
Sure, support for multiple finalizers per object could be added. But as to me, this looks like an over-engineering.
Btw, the current implementation of WeakRegistry is buggy, because of use non-identity based dictionary. Try explore contents of it, after evaluating:
10 timesRepeat: [ registry add: (Array with:10) ]. it adds only a single key/value pair into valueDictionary, while i'd expect to have 10 entries, because i adding 10 distinct array objects. The flaw in logic is easy to illustrate:
array1 := Array with: 10. array2 := Array with: 10.
registry add: array1; array2. array2 at: 1 put: 12. registry add: array2.
array1 := nil "remove strong reference to array1, while keep it for array2"
The point is, that if two disctinct objects answering true on #= , when we adding them to registry it doesn't means that they will keep to be 'equal' after we added them, because these objects can mutate. So, if one of them eventually die, other(s) may still be in use, which will lead to receiving a death notice to wrong recipient.
I don't really like your suggestion for files and sockets, but it's doable.
I was thinking about a simpler registry which would use an OrderedCollection instead of a WeakKeyDictionary. It'd need a new instance variable in WeakFinalizerItem (though it can be another class/subclass too) which would store the index of the finalizer in this OrderedCollection. It wouldn't have #do:, #keys or #remove:ifAbsent: (though all of them could be implemented) and #add: wouldn't replace existing finalizers, but just add them to the registry. This would have a bit better performance, simpler implementation and less features. But if you don't need #remove:, i'm sure it'd fit your needs. (It's a bit tricky though. Your socket/file ideas wouldn't work out of the box, because all of the WeakFinalizerItems would have to be removed from the OrderedCollecion somehow to avoid leaking memory. One solution would be to add it to the list on "remove" (when replacing the ivars with nil), another would be to remove it from the OrderedCollection by index, but the items don't know their collection so it would need another ivar).
All in all, I'd keep WeakRegistry with the current features and optionally add a new simpler and faster registry if needed.
Yes, i'm also thought about different ways how to organize things in less memory & CPU hungry manner. The main role of weak registry is to receive a notification, when particular object become garbage. Obviously, for this, we need to store a reference to notification handler (also known as executor) somewhere to be able to handle it. That's why current implementation using WeakKeyDictionary. Once such notification is handled, we usually no longer need to keep this object as well (thus we have to remove/unlink it from registry).
My implementation is not perfect, it was mainly a quick and dirty hack (you pointed on not using semaphores already). But my intent was mainly to show how to use new VM feature, i added. So, once it will be approved (i hope) and adopted, we could implement things properly.
Levente
On 10 March 2010 05:16, Igor Stasenko siguctua@gmail.com wrote:
On 10 March 2010 04:34, Levente Uzonyi leves@elte.hu wrote:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
Well, if someone cares, then he actually can make own registry class, which allows copying. But why we should care, by leaving a potential security hole open? I don't think that this is normal to rely on good manners of users and expect them to not attempt to do something wrong. In contrast, a protocols and interfaces should discourage user from abuse. At least, an author could state where it is safe to use and where is not. A copy protocol for weak registry is far from being safe. In that situation is better to generate an error, indicating that such use is not foreseen by author, rather than trying to implement something which 'possibly could work' :)
I still don't see how is it unsafe.
A simple copy is fine. A copy, which then registered in weak dependents creating a problem.
I even think that weak registry should not behave as collection at all, and having only #add: method, with no ability to remove items.
The only use of #remove: i see is in Socket and StandardFileStream, which implement #unregister: in own class side.
Now , if you look at my implementation, you could see that there is a way to completely avoid the need in removing items from a valueDictionary which is pairs of <weakref> -> <executor>.
A solution:
- add a 'finalizer' ivar to Socket/StandardFileStream
- by registering a socket in registry, retrieve an instance of
WeakFinalizerItem as a result of registration and store it into 'finalizer' ivar.
- on #destroy, simply nil-out all of the finalizer's ivars, so there
is no chances that once socket become garbage, it will trigger an executor's #finalize method, which were registered previously in registry.
- forget about removing the finalizer manually from registry, because
an instance of WeakFinalizerItem which is held by 'valueDictionary' in registry will eventually be reclaimed, once dictionary will discover that corresponding key is nil.
What you think?
I took a deeper look and found that WeakFinalizationRegistry doesn't support multiple finalizers per object. It does what WeakRegistry did before: throw away the previous finalizer and replace it with a new one.
I like the current features of WeakRegistry (removing, adding, multiple finalizers per object) and I think it's easy to modify it to use your vm support.
Sure, support for multiple finalizers per object could be added. But as to me, this looks like an over-engineering.
Btw, the current implementation of WeakRegistry is buggy, because of use non-identity based dictionary. Try explore contents of it, after evaluating:
10 timesRepeat: [ registry add: (Array with:10) ]. it adds only a single key/value pair into valueDictionary, while i'd expect to have 10 entries, because i adding 10 distinct array objects. The flaw in logic is easy to illustrate:
array1 := Array with: 10. array2 := Array with: 10.
registry add: array1; array2.
oops, the line above should be:
registry add: array1; add: array2.
array2 at: 1 put: 12. registry add: array2.
array1 := nil "remove strong reference to array1, while keep it for array2"
The point is, that if two disctinct objects answering true on #= , when we adding them to registry it doesn't means that they will keep to be 'equal' after we added them, because these objects can mutate. So, if one of them eventually die, other(s) may still be in use, which will lead to receiving a death notice to wrong recipient.
I don't really like your suggestion for files and sockets, but it's doable.
I was thinking about a simpler registry which would use an OrderedCollection instead of a WeakKeyDictionary. It'd need a new instance variable in WeakFinalizerItem (though it can be another class/subclass too) which would store the index of the finalizer in this OrderedCollection. It wouldn't have #do:, #keys or #remove:ifAbsent: (though all of them could be implemented) and #add: wouldn't replace existing finalizers, but just add them to the registry. This would have a bit better performance, simpler implementation and less features. But if you don't need #remove:, i'm sure it'd fit your needs. (It's a bit tricky though. Your socket/file ideas wouldn't work out of the box, because all of the WeakFinalizerItems would have to be removed from the OrderedCollecion somehow to avoid leaking memory. One solution would be to add it to the list on "remove" (when replacing the ivars with nil), another would be to remove it from the OrderedCollection by index, but the items don't know their collection so it would need another ivar).
All in all, I'd keep WeakRegistry with the current features and optionally add a new simpler and faster registry if needed.
Yes, i'm also thought about different ways how to organize things in less memory & CPU hungry manner. The main role of weak registry is to receive a notification, when particular object become garbage. Obviously, for this, we need to store a reference to notification handler (also known as executor) somewhere to be able to handle it. That's why current implementation using WeakKeyDictionary. Once such notification is handled, we usually no longer need to keep this object as well (thus we have to remove/unlink it from registry).
My implementation is not perfect, it was mainly a quick and dirty hack (you pointed on not using semaphores already). But my intent was mainly to show how to use new VM feature, i added. So, once it will be approved (i hope) and adopted, we could implement things properly.
Levente
-- Best regards, Igor Stasenko AKA sig.
On Wed, 10 Mar 2010, Igor Stasenko wrote:
On 10 March 2010 04:34, Levente Uzonyi leves@elte.hu wrote:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
Well, if someone cares, then he actually can make own registry class, which allows copying. But why we should care, by leaving a potential security hole open? I don't think that this is normal to rely on good manners of users and expect them to not attempt to do something wrong. In contrast, a protocols and interfaces should discourage user from abuse. At least, an author could state where it is safe to use and where is not. A copy protocol for weak registry is far from being safe. In that situation is better to generate an error, indicating that such use is not foreseen by author, rather than trying to implement something which 'possibly could work' :)
I still don't see how is it unsafe.
A simple copy is fine. A copy, which then registered in weak dependents creating a problem.
I even think that weak registry should not behave as collection at all, and having only #add: method, with no ability to remove items.
The only use of #remove: i see is in Socket and StandardFileStream, which implement #unregister: in own class side.
Now , if you look at my implementation, you could see that there is a way to completely avoid the need in removing items from a valueDictionary which is pairs of <weakref> -> <executor>.
A solution:
- add a 'finalizer' ivar to Socket/StandardFileStream
- by registering a socket in registry, retrieve an instance of
WeakFinalizerItem as a result of registration and store it into 'finalizer' ivar.
- on #destroy, simply nil-out all of the finalizer's ivars, so there
is no chances that once socket become garbage, it will trigger an executor's #finalize method, which were registered previously in registry.
- forget about removing the finalizer manually from registry, because
an instance of WeakFinalizerItem which is held by 'valueDictionary' in registry will eventually be reclaimed, once dictionary will discover that corresponding key is nil.
What you think?
I took a deeper look and found that WeakFinalizationRegistry doesn't support multiple finalizers per object. It does what WeakRegistry did before: throw away the previous finalizer and replace it with a new one.
I like the current features of WeakRegistry (removing, adding, multiple finalizers per object) and I think it's easy to modify it to use your vm support.
Sure, support for multiple finalizers per object could be added. But as to me, this looks like an over-engineering.
Btw, the current implementation of WeakRegistry is buggy, because of use non-identity based dictionary.
Buggy is a bit strong, it just doesn't support objects which don't implement #= as #== and #hash as #scaledIdentityHash. But I already responded to this here: http://lists.squeakfoundation.org/pipermail/vm-dev/2010-March/003960.html
Try explore contents of it, after evaluating:
10 timesRepeat: [ registry add: (Array with:10) ]. it adds only a single key/value pair into valueDictionary, while i'd expect to have 10 entries, because i adding 10 distinct array objects. The flaw in logic is easy to illustrate:
array1 := Array with: 10. array2 := Array with: 10.
registry add: array1; array2. array2 at: 1 put: 12. registry add: array2.
array1 := nil "remove strong reference to array1, while keep it for array2"
The point is, that if two disctinct objects answering true on #= , when we adding them to registry it doesn't means that they will keep to be 'equal' after we added them, because these objects can mutate. So, if one of them eventually die, other(s) may still be in use, which will lead to receiving a death notice to wrong recipient.
I don't really like your suggestion for files and sockets, but it's doable.
I was thinking about a simpler registry which would use an OrderedCollection instead of a WeakKeyDictionary. It'd need a new instance variable in WeakFinalizerItem (though it can be another class/subclass too) which would store the index of the finalizer in this OrderedCollection. It wouldn't have #do:, #keys or #remove:ifAbsent: (though all of them could be implemented) and #add: wouldn't replace existing finalizers, but just add them to the registry. This would have a bit better performance, simpler implementation and less features. But if you don't need #remove:, i'm sure it'd fit your needs. (It's a bit tricky though. Your socket/file ideas wouldn't work out of the box, because all of the WeakFinalizerItems would have to be removed from the OrderedCollecion somehow to avoid leaking memory. One solution would be to add it to the list on "remove" (when replacing the ivars with nil), another would be to remove it from the OrderedCollection by index, but the items don't know their collection so it would need another ivar).
All in all, I'd keep WeakRegistry with the current features and optionally add a new simpler and faster registry if needed.
Yes, i'm also thought about different ways how to organize things in less memory & CPU hungry manner. The main role of weak registry is to receive a notification, when particular object become garbage. Obviously, for this, we need to store a reference to notification handler (also known as executor) somewhere to be able to handle it. That's why current implementation using WeakKeyDictionary. Once such notification is handled, we usually no longer need to keep this object as well (thus we have to remove/unlink it from registry).
I didn't see any code in your changesets that changed the way how weak registries are notified when their objects are gone. Did I miss something? If not, then #finalizeValues is still sent by the finalization process of WeakArray when it's semaphore is signaled, so one just has to register it's registry there.
Levente
My implementation is not perfect, it was mainly a quick and dirty hack (you pointed on not using semaphores already). But my intent was mainly to show how to use new VM feature, i added. So, once it will be approved (i hope) and adopted, we could implement things properly.
Levente
-- Best regards, Igor Stasenko AKA sig.
2010/3/10 Levente Uzonyi leves@elte.hu:
On Wed, 10 Mar 2010, Igor Stasenko wrote:
On 10 March 2010 04:34, Levente Uzonyi leves@elte.hu wrote:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
Well, if someone cares, then he actually can make own registry class, which allows copying. But why we should care, by leaving a potential security hole open? I don't think that this is normal to rely on good manners of users and expect them to not attempt to do something wrong. In contrast, a protocols and interfaces should discourage user from abuse. At least, an author could state where it is safe to use and where is not. A copy protocol for weak registry is far from being safe. In that situation is better to generate an error, indicating that such use is not foreseen by author, rather than trying to implement something which 'possibly could work' :)
I still don't see how is it unsafe.
A simple copy is fine. A copy, which then registered in weak dependents creating a problem.
I even think that weak registry should not behave as collection at all, and having only #add: method, with no ability to remove items.
The only use of #remove: i see is in Socket and StandardFileStream, which implement #unregister: in own class side.
Now , if you look at my implementation, you could see that there is a way to completely avoid the need in removing items from a valueDictionary which is pairs of <weakref> -> <executor>.
A solution:
- add a 'finalizer' ivar to Socket/StandardFileStream
- by registering a socket in registry, retrieve an instance of
WeakFinalizerItem as a result of registration and store it into 'finalizer' ivar.
- on #destroy, simply nil-out all of the finalizer's ivars, so there
is no chances that once socket become garbage, it will trigger an executor's #finalize method, which were registered previously in registry.
- forget about removing the finalizer manually from registry, because
an instance of WeakFinalizerItem which is held by 'valueDictionary' in registry will eventually be reclaimed, once dictionary will discover that corresponding key is nil.
What you think?
I took a deeper look and found that WeakFinalizationRegistry doesn't support multiple finalizers per object. It does what WeakRegistry did before: throw away the previous finalizer and replace it with a new one.
I like the current features of WeakRegistry (removing, adding, multiple finalizers per object) and I think it's easy to modify it to use your vm support.
Sure, support for multiple finalizers per object could be added. But as to me, this looks like an over-engineering.
Btw, the current implementation of WeakRegistry is buggy, because of use non-identity based dictionary.
Buggy is a bit strong, it just doesn't support objects which don't implement #= as #== and #hash as #scaledIdentityHash. But I already responded to this here: http://lists.squeakfoundation.org/pipermail/vm-dev/2010-March/003960.html
Okay. It is not buggy. It is wrong. :)
Try explore contents of it, after evaluating:
10 timesRepeat: [ registry add: (Array with:10) ]. it adds only a single key/value pair into valueDictionary, while i'd expect to have 10 entries, because i adding 10 distinct array objects. The flaw in logic is easy to illustrate:
array1 := Array with: 10. array2 := Array with: 10.
registry add: array1; array2. array2 at: 1 put: 12. registry add: array2.
array1 := nil "remove strong reference to array1, while keep it for array2"
The point is, that if two disctinct objects answering true on #= , when we adding them to registry it doesn't means that they will keep to be 'equal' after we added them, because these objects can mutate. So, if one of them eventually die, other(s) may still be in use, which will lead to receiving a death notice to wrong recipient.
I don't really like your suggestion for files and sockets, but it's doable.
I was thinking about a simpler registry which would use an OrderedCollection instead of a WeakKeyDictionary. It'd need a new instance variable in WeakFinalizerItem (though it can be another class/subclass too) which would store the index of the finalizer in this OrderedCollection. It wouldn't have #do:, #keys or #remove:ifAbsent: (though all of them could be implemented) and #add: wouldn't replace existing finalizers, but just add them to the registry. This would have a bit better performance, simpler implementation and less features. But if you don't need #remove:, i'm sure it'd fit your needs. (It's a bit tricky though. Your socket/file ideas wouldn't work out of the box, because all of the WeakFinalizerItems would have to be removed from the OrderedCollecion somehow to avoid leaking memory. One solution would be to add it to the list on "remove" (when replacing the ivars with nil), another would be to remove it from the OrderedCollection by index, but the items don't know their collection so it would need another ivar).
All in all, I'd keep WeakRegistry with the current features and optionally add a new simpler and faster registry if needed.
Yes, i'm also thought about different ways how to organize things in less memory & CPU hungry manner. The main role of weak registry is to receive a notification, when particular object become garbage. Obviously, for this, we need to store a reference to notification handler (also known as executor) somewhere to be able to handle it. That's why current implementation using WeakKeyDictionary. Once such notification is handled, we usually no longer need to keep this object as well (thus we have to remove/unlink it from registry).
I didn't see any code in your changesets that changed the way how weak registries are notified when their objects are gone. Did I miss something? If not, then #finalizeValues is still sent by the finalization process of WeakArray when it's semaphore is signaled, so one just has to register it's registry there.
Well, looks like you paying too much attention to implementation details and don't see the whole idea behind this. Its not really matter in what way you receiving such notification , what matters is that sometimes you need to be able to receive a notification when some object become garbage. This is the reason why we having a WeakRegistry.
Levente
My implementation is not perfect, it was mainly a quick and dirty hack (you pointed on not using semaphores already). But my intent was mainly to show how to use new VM feature, i added. So, once it will be approved (i hope) and adopted, we could implement things properly.
Levente
-- Best regards, Igor Stasenko AKA sig.
On 10 March 2010 06:03, Igor Stasenko siguctua@gmail.com wrote:
2010/3/10 Levente Uzonyi leves@elte.hu:
The main role of weak registry is to receive a notification, when particular object become garbage. Obviously, for this, we need to store a reference to notification handler (also known as executor) somewhere to be able to handle it. That's why current implementation using WeakKeyDictionary. Once such notification is handled, we usually no longer need to keep this object as well (thus we have to remove/unlink it from registry).
I didn't see any code in your changesets that changed the way how weak registries are notified when their objects are gone. Did I miss something? If not, then #finalizeValues is still sent by the finalization process of WeakArray when it's semaphore is signaled, so one just has to register it's registry there.
Well, looks like you paying too much attention to implementation details and don't see the whole idea behind this. Its not really matter in what way you receiving such notification , what matters is that sometimes you need to be able to receive a notification when some object become garbage. This is the reason why we having a WeakRegistry.
Think of following potential implementation:
SocketHandle>>initialize: ... .... socketHandle := self primSocketCreateNetwork: 0 type: socketType receiveBufferSize: 8000 sendBufSize: 8000 semaIndex: semaIndex readSemaIndex: readSemaIndex writeSemaIndex: writeSemaIndex.
socketHandle ifNotNil: [ self onDeathDo: [ Socket primSocketDestroyGently: socketHandle. Smalltalk unregisterExternalObject: semaphore. Smalltalk unregisterExternalObject: readSemaphore. Smalltalk unregisterExternalObject: writeSemaphore. ]
as simple as that. You just telling the system, that once socket object become garbage, evaluate the block - an argument of #onDeathDo: message.
Do such pattern convincing you that what we talking here is about notification when particular object dies? Also, you may notice that given approach is much simpler and straightforward comparing to existing one, where you have to implement own #register:, #unregister: and #finalize? We could have such protocol (#myWill: , #cancelBurialCeremony ;) ) for all Objects in system, and don't pay attention anymore to speak directly to WeakRegistry.
On Wed, 10 Mar 2010, Igor Stasenko wrote:
2010/3/10 Levente Uzonyi leves@elte.hu:
On Wed, 10 Mar 2010, Igor Stasenko wrote:
On 10 March 2010 04:34, Levente Uzonyi leves@elte.hu wrote:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
2010/3/9 Levente Uzonyi leves@elte.hu:
On Tue, 9 Mar 2010, Igor Stasenko wrote:
> Well, if someone cares, then he actually can make own registry class, > which allows copying. But why we should care, by leaving a potential > security hole open? > I don't think that this is normal to rely on good manners of users and > expect them to not attempt to do something wrong. In contrast, a > protocols and interfaces should discourage user from abuse. At least, > an author could state where it is safe to use and where is not. > A copy protocol for weak registry is far from being safe. > In that situation is better to generate an error, indicating that such > use is not foreseen by author, rather than trying to implement > something which 'possibly could work' :)
I still don't see how is it unsafe.
A simple copy is fine. A copy, which then registered in weak dependents creating a problem.
I even think that weak registry should not behave as collection at all, and having only #add: method, with no ability to remove items.
The only use of #remove: i see is in Socket and StandardFileStream, which implement #unregister: in own class side.
Now , if you look at my implementation, you could see that there is a way to completely avoid the need in removing items from a valueDictionary which is pairs of <weakref> -> <executor>.
A solution:
- add a 'finalizer' ivar to Socket/StandardFileStream
- by registering a socket in registry, retrieve an instance of
WeakFinalizerItem as a result of registration and store it into 'finalizer' ivar.
- on #destroy, simply nil-out all of the finalizer's ivars, so there
is no chances that once socket become garbage, it will trigger an executor's #finalize method, which were registered previously in registry.
- forget about removing the finalizer manually from registry, because
an instance of WeakFinalizerItem which is held by 'valueDictionary' in registry will eventually be reclaimed, once dictionary will discover that corresponding key is nil.
What you think?
I took a deeper look and found that WeakFinalizationRegistry doesn't support multiple finalizers per object. It does what WeakRegistry did before: throw away the previous finalizer and replace it with a new one.
I like the current features of WeakRegistry (removing, adding, multiple finalizers per object) and I think it's easy to modify it to use your vm support.
Sure, support for multiple finalizers per object could be added. But as to me, this looks like an over-engineering.
Btw, the current implementation of WeakRegistry is buggy, because of use non-identity based dictionary.
Buggy is a bit strong, it just doesn't support objects which don't implement #= as #== and #hash as #scaledIdentityHash. But I already responded to this here: http://lists.squeakfoundation.org/pipermail/vm-dev/2010-March/003960.html
Okay. It is not buggy. It is wrong. :)
For some edge cases. I added a fix to the trunk. :)
Try explore contents of it, after evaluating:
10 timesRepeat: [ registry add: (Array with:10) ]. it adds only a single key/value pair into valueDictionary, while i'd expect to have 10 entries, because i adding 10 distinct array objects. The flaw in logic is easy to illustrate:
array1 := Array with: 10. array2 := Array with: 10.
registry add: array1; array2. array2 at: 1 put: 12. registry add: array2.
array1 := nil "remove strong reference to array1, while keep it for array2"
The point is, that if two disctinct objects answering true on #= , when we adding them to registry it doesn't means that they will keep to be 'equal' after we added them, because these objects can mutate. So, if one of them eventually die, other(s) may still be in use, which will lead to receiving a death notice to wrong recipient.
I don't really like your suggestion for files and sockets, but it's doable.
I was thinking about a simpler registry which would use an OrderedCollection instead of a WeakKeyDictionary. It'd need a new instance variable in WeakFinalizerItem (though it can be another class/subclass too) which would store the index of the finalizer in this OrderedCollection. It wouldn't have #do:, #keys or #remove:ifAbsent: (though all of them could be implemented) and #add: wouldn't replace existing finalizers, but just add them to the registry. This would have a bit better performance, simpler implementation and less features. But if you don't need #remove:, i'm sure it'd fit your needs. (It's a bit tricky though. Your socket/file ideas wouldn't work out of the box, because all of the WeakFinalizerItems would have to be removed from the OrderedCollecion somehow to avoid leaking memory. One solution would be to add it to the list on "remove" (when replacing the ivars with nil), another would be to remove it from the OrderedCollection by index, but the items don't know their collection so it would need another ivar).
All in all, I'd keep WeakRegistry with the current features and optionally add a new simpler and faster registry if needed.
Yes, i'm also thought about different ways how to organize things in less memory & CPU hungry manner. The main role of weak registry is to receive a notification, when particular object become garbage. Obviously, for this, we need to store a reference to notification handler (also known as executor) somewhere to be able to handle it. That's why current implementation using WeakKeyDictionary. Once such notification is handled, we usually no longer need to keep this object as well (thus we have to remove/unlink it from registry).
I didn't see any code in your changesets that changed the way how weak registries are notified when their objects are gone. Did I miss something? If not, then #finalizeValues is still sent by the finalization process of WeakArray when it's semaphore is signaled, so one just has to register it's registry there.
Well, looks like you paying too much attention to implementation details and don't see the whole idea behind this. Its not really matter in what way you receiving such notification , what matters is that sometimes you need to be able to receive a notification when some object become garbage. This is the reason why we having a WeakRegistry.
I just tried to understand why your current implementation needs a WeakKeyDictionary, but I didn't.
Levente
Levente
My implementation is not perfect, it was mainly a quick and dirty hack (you pointed on not using semaphores already). But my intent was mainly to show how to use new VM feature, i added. So, once it will be approved (i hope) and adopted, we could implement things properly.
Levente
-- Best regards, Igor Stasenko AKA sig.
-- Best regards, Igor Stasenko AKA sig.
Guys, i need your opinion on this. What are you generally thinking about this idea? If it ok in general, and you want it to be in VM, then i will continue with my effort to fix all remarks, which Levente and others pointed out. But since only Levente and Andreas commented this so far, and i can't decide if this feature is accepted by others or not. If you think this is not the way to go, please explain why you think that. Sure thing i, as an author want this change to be integrated.
On Sat, Mar 13, 2010 at 3:45 PM, Igor Stasenko siguctua@gmail.com wrote:
Guys, i need your opinion on this. What are you generally thinking about this idea? If it ok in general, and you want it to be in VM, then i will continue with my effort to fix all remarks, which Levente and others pointed out. But since only Levente and Andreas commented this so far, and i can't decide if this feature is accepted by others or not. If you think this is not the way to go, please explain why you think that. Sure thing i, as an author want this change to be integrated.
I'm not sure I understand the semantics yet. Andreas and I discussed it a little today and we thought that is that it is a way of associating a finalizer with a specific object such that the such that the object is still alive when it is finalized. But now I reread your proposal it looks like the finalizer gets run only when its object has been collected.
The most important thing is for the finalization system not to be post-mortem (not run a finalization action when the object is already dead). Consider a buffered file. The finalization action should flush the buffer as well as close the file handle. If the finalization is post-mortem then one finalizes a copy of the file. Hence to ensure that the copy has the same buffer state one has to update the copy every time the real file's buffer changes state, and that's bad for performance.
If I've misunderstood the proposal then please try and write a more comprehensible specification. If however the proposal still does post-mortem finalization then IMO its not worth it and we should try and implement a pre-mortem scheme where the garbage collector detects when an object is about to be collected (usually implemented by the GC detecting when the object is only referenced from finalizers, and queueing the finalizers).
2¢
cheers Eliot
-- Best regards, Igor Stasenko AKA sig.
On 14 March 2010 03:25, Eliot Miranda eliot.miranda@gmail.com wrote:
On Sat, Mar 13, 2010 at 3:45 PM, Igor Stasenko siguctua@gmail.com wrote:
Guys, i need your opinion on this. What are you generally thinking about this idea? If it ok in general, and you want it to be in VM, then i will continue with my effort to fix all remarks, which Levente and others pointed out. But since only Levente and Andreas commented this so far, and i can't decide if this feature is accepted by others or not. If you think this is not the way to go, please explain why you think that. Sure thing i, as an author want this change to be integrated.
I'm not sure I understand the semantics yet. Andreas and I discussed it a little today and we thought that is that it is a way of associating a finalizer with a specific object such that the such that the object is still alive when it is finalized. But now I reread your proposal it looks like the finalizer gets run only when its object has been collected. The most important thing is for the finalization system not to be post-mortem (not run a finalization action when the object is already dead). Consider a buffered file. The finalization action should flush the buffer as well as close the file handle. If the finalization is post-mortem then one finalizes a copy of the file. Hence to ensure that the copy has the same buffer state one has to update the copy every time the real file's buffer changes state, and that's bad for performance.
Its not very good example to show an advantage of pre-mortem finalization over post-mortem. You could implement a correct finalization scheme of the above using post-mortem finalization, by simply keeping a buffer state in a separate object, to which both, an object (which is going to die) and finalizer (or executor) referencing:
weak ref -> file (buffer, handle) strong ref -> finalizer of file (buffer , handle)
in case , when file collected, you still having a valid reference to its buffer in finalizer, which having the very same state (since it is the same object), and so, you could flush that buffer and then close the handle without losing data.
If I've misunderstood the proposal then please try and write a more comprehensible specification. If however the proposal still does post-mortem finalization then IMO its not worth it and we should try and implement a pre-mortem scheme where the garbage collector detects when an object is about to be collected (usually implemented by the GC detecting when the object is only referenced from finalizers, and queueing the finalizers).
I think you understood it well. Its a small modification to current post-mortem finalization, which allows us to react directly only on those objects, which been collected. To describe it primitively: Suppose that you having a number of weak references, held by a single container (WeakArray or other, not really matter). Now at some point you want to find out, how many of objects in it is collected after a while (you could choose to do a check immediately after each GC, or can do it periodically - not really matters). In current implementation it is possible to do it only by scanning a full contents of container , in order to find which references replaced by nil. In scheme which i proposed, you could avoid a full scan, which is in a linear dependency from container size, and react directly only on those elements which collected. This means that if your container size is X , old scheme implies to loop through X elements to find those who collected, while new scheme is always O(n), where n is the number of elements which being collected.
2¢ cheers Eliot
-- Best regards, Igor Stasenko AKA sig.
vm-dev@lists.squeakfoundation.org