Just a quick note I would like to share. For my PhD, I investigated ImageSegment very deeply:
http://dl.acm.org/citation.cfm?id=2076323 http://www.slideshare.net/MarianoMartinezPeck/2010-smalltalkspeckobject-swap...
I didn't want to write Fuel just because. I took quite a lot of time to understand how ImageSegment primitives worked. From that effort, I remember a few conclusions:
1) I found only a few users of ImageSegment.
2) The few users I found were NOT using ImageSegment for its real purpose, that is, object swapping. It was used instead as an object serializer. For that, they used #writeForExportOn:, which ended up using SmartRefStream for the rest of the objects.
3) I noticed I could achieve the same performance, or even better, with an OO serializer built at the language side, with all the benefits this means. Of course, having Cog helped here.
In the Fuel paper: http://rmod.lille.inria.fr/archives/papers/Dias12a-SPE-Fuel.pdf you can find some benchmark comparisons against IS. Also in my PhD: http://rmod.lille.inria.fr/archives/phd/PhD-2012-Martinez-Peck.pdf
Cheers,
On Mon, Oct 20, 2014 at 9:56 PM, jvuletich@dc.uba.ar wrote:
Hi Eliot,
Hi All,
I want to check my understanding of reference semantics for image segments as I'm close to completing the Spur implementation. Specifically, the question is whether objects reachable only through weak pointers should be included in an image segment or not.
Remember that an image segment is created from the transitive closure of an Array of root objects, the *segment roots*. That is, we can think of an image segment as a set of objects created by tracing the object graph from the segment roots.
The segment always includes the segment roots. Except for the roots, objects that are also reachable from the roots of the system are excluded from the segment (the *system roots* are effectively the root environment, Smalltalk, and the stack of the current process).
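The definition above can be sketched as a small graph computation. This is a toy model in Python, not the VM primitive: the dictionary-based object graph, the function names, and the example objects are all hypothetical. It also assumes that the trace from the system roots does not pass through the segment roots, which the text above does not state explicitly.

```python
from collections import deque

def reachable(graph, starts, stop_at=frozenset()):
    """Return every object reachable from `starts`, never entering `stop_at`."""
    seen = set()
    queue = deque(s for s in starts if s not in stop_at)
    while queue:
        obj = queue.popleft()
        if obj in seen:
            continue
        seen.add(obj)
        for ref in graph.get(obj, ()):
            if ref not in stop_at and ref not in seen:
                queue.append(ref)
    return seen

def image_segment(graph, segment_roots, system_roots):
    """Segment = segment roots + their closure, minus what the system also reaches."""
    roots = set(segment_roots)
    # Assumption: the system trace stops at the segment roots.
    system_side = reachable(graph, system_roots, stop_at=roots)
    closure = reachable(graph, roots)
    return roots | (closure - system_side)

# Hypothetical object graph: each key points to the objects it references.
graph = {
    "Smalltalk": ["project", "shared"],
    "project": ["morph"],
    "morph": ["shared", "private"],
}
segment = image_segment(graph, ["project"], ["Smalltalk"])
```

Here `shared` stays out of the segment because the system roots also reach it, while `morph` and `private` go in because they are only reachable through the segment root `project`.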
Consider a weak array in the transitive closure that is not reachable from the system roots, and hence should be included in the segment. Objects referenced from that weak array may be in one of three categories:
- reachable from the system roots (and hence not to be included in the segment)
- *not* reachable from the system roots, but reachable from the segment roots via strong pointers (and hence to be included in the segment)
- *not* reachable from the system roots, *not* reachable from the segment roots via strong pointers
Should this last category be included or excluded from the segment? I think that it makes no difference, and excluding them is only an optimization. The argument is as follows. Imagine that immediately after loading the image segment there is a garbage collection. That garbage collection will collect all the objects in the last category as they are only reachable from the weak arrays in the segment. Hence we are free to follow weak references as if they are strong when we create the image segment, leaving it to subsequent events to reclaim those objects.
An analogous argument accounts for objects reachable from ephemerons. Is my reasoning sound? -- best, Eliot
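The argument above can be illustrated with a toy model in Python. Everything here is an assumption made for illustration: strong and weak references are kept in two hypothetical dictionaries, none of the objects is reachable from the system roots, a garbage collection is modeled as "keep what is strongly reachable from the segment roots", and ephemerons are not modeled.

```python
def trace(strong, weak, roots, follow_weak):
    """Collect objects reachable from roots; weak edges count only if follow_weak."""
    seen, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(strong.get(obj, ()))
        if follow_weak:
            stack.extend(weak.get(obj, ()))
    return seen

# Hypothetical graph: "b" (and "c" behind it) is reachable only via the weak array.
strong = {"root": ["weak_array", "a"], "b": ["c"]}
weak = {"weak_array": ["a", "b"]}

# Segment built by following weak references as if they were strong.
generous = trace(strong, weak, ["root"], follow_weak=True)
# Segment built by following strong references only.
strict = trace(strong, weak, ["root"], follow_weak=False)
# Simulated GC right after loading the generous segment.
survivors = generous & trace(strong, weak, ["root"], follow_weak=False)
collected = generous - survivors
```

In this model the survivors of the generous segment are exactly the strict segment, so the extra objects only cost space in the segment until the next collection. The model does not cover the case raised in the reply below, where something gains a strong reference in between.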
I think you are right. But there is a risk of someone somehow gaining a strong reference to the object after the image segment was created, breaking our invariants when the segment is loaded again.
An object might be (not reachable / strongly reachable / weakly reachable) from system roots and / or segment roots. This gives us 9 possibilities. Six of them are easy (and I'll not go into them). The other three are tricky:
a- Not reachable from system roots. Weakly reachable from segment roots. Do not include them. It is best to run a GC before building the image segment, to get rid of them (run termination, etc). This is to avoid the risk of the object somehow gaining a strong reference after the segment is built, making the segment miss the weak ref to it. Doing it this way would also mean that any objects affected by termination would be consistent, both in the image and in the segment.
b- Weakly reachable from system roots. Weakly reachable from segment roots. Do not include them. If the object manages to survive by gaining a strong ref from the system roots, the weak ref will be repaired on segment load. (Am I right on this?) If the original object was included in the segment, then on segment load it would point to a duplicate object that is about to be collected (and maybe terminated?). In any case, doing it this way would also mean that any objects affected by termination would be consistent, both in the image and in the segment.
c- Weakly reachable from system roots. Strongly reachable from segment roots. Do include them. It seems reasonable to run a GC and get rid of them after unloading the segment, to avoid the risk of the object somehow gaining a strong ref in the image, and being duplicated on segment load. But doing as I say means that we would be loading into the image an object that was already terminated, although in the state it had before running termination. Not really sure if this is ok. There could be some risk of objects in the segment being in some pre-termination state, with some objects in the image being in some after-termination state. In any case, this would suggest bad design... So perhaps it makes sense to throw an exception in these cases?
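The three tricky cases above can be written down as a small lookup, sketched here in Python. The graph representation, the names, and the example objects are hypothetical, and "weakly reachable" is taken to mean "reachable only if weak references are followed". The six easy cases are not discussed above, so the sketch returns None for them rather than guessing.

```python
def trace(strong, weak, roots):
    """Collect objects reachable from roots over the given strong and weak edges."""
    seen, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(strong.get(obj, ()))
        stack.extend(weak.get(obj, ()))
    return seen

def reach_kind(strong, weak, roots, target):
    """Classify target as 'strong', 'weak' or 'none' relative to roots."""
    if target in trace(strong, {}, roots):
        return "strong"
    if target in trace(strong, weak, roots):
        return "weak"
    return "none"

# Keys are (reachability from system roots, reachability from segment roots).
TRICKY = {
    ("none", "weak"): (False, "case a: run a GC before building the segment"),
    ("weak", "weak"): (False, "case b: weak ref expected to be repaired on load"),
    ("weak", "strong"): (True, "case c: run a GC after unloading, or raise an error"),
}

def decide(strong, weak, system_roots, segment_roots, target):
    """Return (include, note) for a tricky case, or None for the six easy ones."""
    key = (reach_kind(strong, weak, system_roots, target),
           reach_kind(strong, weak, segment_roots, target))
    return TRICKY.get(key)

# Hypothetical graph with one object for each tricky case.
strong = {"Smalltalk": ["registry"], "seg_root": ["seg_weak", "c"]}
weak = {"registry": ["b", "c"], "seg_weak": ["a", "b"]}
decisions = {name: decide(strong, weak, ["Smalltalk"], ["seg_root"], name)
             for name in ("a", "b", "c")}
```

In this example `a` and `b` are left out of the segment and `c` is included, matching cases a, b and c respectively.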
I hope this rant is of use.
Cheers, Juan Vuletich
On 20.10.2014, at 18:55, Mariano Martinez Peck marianopeck@gmail.com wrote:
The few users I found, were NOT using the real purpose of ImageSegment, that is, object swapping. It was used instead as an object serializer. For that, they use #writeForExportOn: which ended up using SmartRefStream for the rest of the objects.
Well, if you look closer, you will see that projects use image segments in two completely different ways. One is, as you say, for serialization, which is not the best use of image segments, agreed, especially with all the other logic wrapped around it.
But if you enable projectsSentToDisk then entering a project will swap the previous project to disk as an image segment, allowing you to have images with very large projects without having to hold all in main memory at the same time.
This uses a completely different code path and file format than regular project export. The same technique could be used to swap out arbitrary chunks of an image.
- Bert -
On Wed, Oct 22, 2014 at 2:05 AM, Bert Freudenberg bert@freudenbergs.de wrote:
On 20.10.2014, at 18:55, Mariano Martinez Peck marianopeck@gmail.com wrote:
The few users I found, were NOT using the real purpose of ImageSegment, that is, object swapping. It was used instead as an object serializer. For that, they use #writeForExportOn: which ended up using SmartRefStream for the rest of the objects.
Well, if you look closer, you will see that projects use image segments in two completely different ways. One is, as you say, for serialization, which is not the best use of image segments, agreed, especially with all the other logic wrapped around it.
But if you enable projectsSentToDisk then entering a project will swap the previous project to disk as an image segment, allowing you to have images with very large projects without having to hold all in main memory at the same time.
This uses a completely different code path and file format than regular project export. The same technique could be used to swap out arbitrary chunks of an image.
Totally agree. So it seems we agree that the key strength of ImageSegment is swapping objects out, not serving as a general object graph serializer.
vm-dev@lists.squeakfoundation.org