On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse <stephane.ducasse@inria.fr> wrote:
I would like to be sure that we can have
- a bit for immutable objects
- bits for experimenting.
There will be quite a few. And one will be able to steal bits from the class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
8: slot size (255 => extra header word with large size)
3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
4: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
1: immutability
3: GC (2 mark bits, 1 forwarded bit)
20: identity hash
20: class index
still leaves 5 bits unused. And stealing 4 bits each from class index still leaves 64k classes. So this format is simple and provides lots of unused bits. The format field is a great idea as it combines a number of orthogonal properties in very few bits. I don't want to include odd bytes in format because I think a separate field that holds odd bytes and fixed fields is better use of space. But we can gather statistics before deciding.
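To make the arithmetic concrete, here is a C sketch of one way such a 64-bit header could be packed. The bit positions are my own illustrative assumption; the mail is explicit that the layout is not final.

```c
#include <stdint.h>

/* One illustrative packing of the tentative 64-bit header (positions assumed):
   8 slot size + 3 odd bytes/fixed fields + 4 format + 1 immutable
   + 3 GC + 20 identity hash + 20 class index = 59 bits, 5 unused. */
enum {
    ClassIndexShift = 0,  ClassIndexBits = 20,
    IdentHashShift  = 20, IdentHashBits  = 20,
    GCBitsShift     = 40, GCBitsBits     = 3,
    ImmutableShift  = 43, ImmutableBits  = 1,
    FormatShift     = 44, FormatBits     = 4,
    OddBytesShift   = 48, OddBytesBits   = 3,
    SlotSizeShift   = 51, SlotSizeBits   = 8   /* 255 => overflow size word */
};

/* read a header field */
static uint64_t field(uint64_t header, unsigned shift, unsigned bits)
{
    return (header >> shift) & ((1ULL << bits) - 1);
}

/* answer a copy of header with one field replaced */
static uint64_t withField(uint64_t header, unsigned shift, unsigned bits, uint64_t v)
{
    uint64_t mask = ((1ULL << bits) - 1) << shift;
    return (header & ~mask) | ((v << shift) & mask);
}
```

Stealing bits from the class index, as discussed, is then just a matter of shrinking ClassIndexBits and reusing the freed positions.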
Stef
On May 30, 2012, at 8:48 AM, Stéphane Ducasse wrote:
Hi guys
Here is an important topic I would like to see discussed so that we see
how we can improve and join forces.
Maybe a mail discussion and then a wiki for the summary would be good.
stef
Begin forwarded message:
From: Eliot Miranda <eliot.miranda@gmail.com>
Subject: Re: Plan/discussion/communication around new object format
Date: May 27, 2012 10:49:54 PM GMT+02:00
To: Stéphane Ducasse <stephane.ducasse@inria.fr>
On Sat, May 26, 2012 at 1:46 AM, Stéphane Ducasse <
stephane.ducasse@inria.fr> wrote:
Hi eliot
do you have a description of the new object format you want to
introduce?
The design is in the class comment of CogMemoryManager in the Cog
VMMaker packages.
Then what is your schedule?
This is difficult. I have made a small start and should be able to
spend time on it starting soon. I want to have it finished by early next year. But it depends on work schedules etc.
I would like to see if we can allocate igor/esteban time before we run
out of money
to help on that important topic. Right now the solution is unclear and I did not see any document where we can evaluate and plan how we can help. So do you want help on that topic? And if so, how can people contribute?
The first thing to do is to read the design document, to see if the
Pharo community thinks it is the right direction, and to review it, spot deficiencies etc. So please get those interested to read the class comment of CogMemoryManager in the latest VMMaker.oscog.
Here's the current version of it:
CogMemoryManager is currently a place-holder for the design of the new
Cog VM's object representation and garbage collector. The goals for the GC are
- efficient object representation a la Eliot Miranda's VisualWorks
64-bit object representation that uses a 64-bit header, eliminating direct class references so that all objects refer to their classes indirectly. Instead the header contains a constant class index, in a field smaller than a full pointer. These class indices are used in inline and first-level method caches, hence they do not have to be updated on GC (although they do have to be traced to be able to GC classes). Classes are held in a sparse weak table. The class table needs to be indexed by an instance's class index only in class hierarchy search, in the class primitive, and in tracing live objects in the heap.

The additional header space is allocated to a much expanded identity hash field, reducing hash efficiency problems in identity collections due to the extremely small (11 bit) hash field in the old Squeak GC. The identity hash field is also a key element of the class index scheme. A class's identity hash is its index into the class table, so to create an instance of a class one merely copies its identity hash into the class index field of the new instance. This implies that when classes gain their identity hash they are entered into the class table, and that their identity hash is that of a previously unused index in the table. It also implies that there is a maximum number of classes in the table; at least for a few years 64k classes should be enough. A class is entered into the class table in the following operations:
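A tiny C sketch of the instantiation side of this scheme (all names here are mine, for illustration only): a class's identity hash is its index in the class table, so instantiation just copies it into the instance's class index field.

```c
#include <stdint.h>
#include <stddef.h>

#define CLASS_TABLE_SIZE (1u << 16)   /* "64k classes should be enough" */

typedef struct object { uint32_t classIndex, identityHash; } object;

static object *classTable[CLASS_TABLE_SIZE];  /* weak in the real VM */

/* entering a class: its identity hash becomes a previously unused index.
   A linear scan is a gross simplification; indices 0-7 are reserved. */
static uint32_t enterClass(object *aClass)
{
    for (uint32_t i = 8; i < CLASS_TABLE_SIZE; i++)
        if (classTable[i] == NULL) {
            classTable[i] = aClass;
            aClass->identityHash = i;
            return i;
        }
    return 0;  /* table full */
}

/* instantiation: copy the class's identity hash into the class index field */
static void setClassOf(object *instance, object *aClass)
{
    instance->classIndex = aClass->identityHash;
}

static object *classOf(object *instance)
{
    return classTable[instance->classIndex];
}
```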
	behaviorHash
	adoptInstance
	instantiate
	become (i.e. if an old class becomes a new class):
		if the target's class field = the original's identity hash
		and the replacement's identity hash is zero,
		enter the replacement in the class table
behaviorHash is a special version of identityHash that must be
implemented in the image by any object that can function as a class (i.e. Behavior).
- more immediate classes. An immediate Character class would speed up
String accessing, especially for WideString, since no instantiation needs to be done on at:put: and no dereference need be done on at:. In a 32-bit system tag checking is complex since it is thought important to retain 31-bit SmallIntegers. Hence, as in current Squeak, the least significant bit set implies a SmallInteger, but Characters would likely have a tag pattern of xxx10. Hence masking with 11 results in two values for SmallInteger, xxx01 and xxx11. 30-bit characters are more than adequate for Unicode. In a 64-bit system we can use the full three bits and usefully implement an immediate Float. As in VisualWorks a functional representation takes three bits away from the exponent. Rotating to put the sign bit in the least significant non-tag bit makes expanding and contracting the 8-bit exponent to the 11-bit IEEE double exponent easy and makes comparing negative and positive zero easier (an immediate Float is zero if its unsigned 64 bits are < 16). So the representation looks like
| 8 bit exponent | 52 bit mantissa | sign bit | 3 tag bits |
For details see "60-bit immediate Floats" below.
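The 32-bit tag checks described above can be sketched in C; the constructor/accessor names are hypothetical, but the tag patterns (low bit set for SmallInteger, xx10 for Character) follow the text.

```c
#include <stdint.h>

typedef uint32_t oop;

/* SmallInteger: any oop with the least significant bit set (xx01 and xx11) */
static int isSmallInteger(oop o) { return (o & 1) != 0; }

/* Character: tag pattern xx10, i.e. masking with binary 11 answers 2 */
static int isCharacter(oop o) { return (o & 3) == 2; }

/* untagged object pointers have both low bits clear */
static int isPointer(oop o) { return (o & 3) == 0; }

static oop     smallIntegerFor(int32_t v) { return ((oop)v << 1) | 1; }
static int32_t smallIntegerValue(oop o)   { return (int32_t)o >> 1; }  /* 31-bit value; assumes arithmetic shift */

static oop      characterFor(uint32_t codePoint) { return (codePoint << 2) | 2; }
static uint32_t characterValue(oop o)            { return o >> 2; }    /* 30-bit value */
```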
- efficient scavenging. The current Squeak GC uses a slow
pointer-reversal collector that writes every field in live objects three times in each collection, twice in the pointer-reversing heap traversal to mark live objects and once to update the pointer to its new location. A scavenger writes every field of live data twice in each collection, once as it does a block copy of the object when copying to to space, once as it traverses the live pointers in the to space objects. Of course the block copy is a relatively cheap write.
- lazy become. The JIT's use of inline caching provides a cheap way
of avoiding scanning the heap as part of a become (which is the simple approach to implementing become in a system with direct pointers). A becomeForward: on a (set of) non-zero-sized object(s) turns the object into a "corpse" or "forwarding object" whose first (non-header) word/slot is replaced by a pointer to the target of the becomeForward:. The corpse's class index is set to one that identifies corpses and, because it is a hidden class index, will always fail an inline cache test. The inline cache failure code is then responsible for following the forwarding pointer chain (these are Iliffe vectors :) ) and resolving to the actual target. We have yet to determine exactly how this is done (e.g. change the receiver register and/or stack contents and retry the send, perhaps scanning the current activation). See below on how we deal with becomes on objects with named inst vars. Note that we probably don't have to worry about zero-sized objects. These are unlikely to be passed through the FFI (there is nothing to pass :) ) and so will rarely be becommed. If they are, they can become slowly. Alternatively we can insist that objects are at least 16 bytes in size (see 8-byte alignment below) so that there will always be space for a forwarding pointer. Since none of the immediate classes can have non-immediate instances, and since we allocate the immediate class indices corresponding to their tag patterns (SmallInteger = 1, Character = 3, SmallFloat = 4?), we can use all the class indices from 0 to 7 for special uses, e.g. 0 = forward and 1 = header-sized filler.
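A minimal C sketch of corpse following, as the inline cache miss path must do after a becomeForward: (the struct layout is a stand-in for illustration, not the VM's):

```c
#include <stdint.h>
#include <stddef.h>

enum { ForwardedClassIndex = 0 };  /* "0 = forward" among the reserved indices */

typedef struct object {
    uint32_t classIndex;
    struct object *firstSlot;  /* in a corpse, the forwarding pointer */
} object;

/* follow a chain of corpses to the live target */
static object *followForwarded(object *o)
{
    while (o->classIndex == ForwardedClassIndex)
        o = o->firstSlot;
    return o;
}
```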
- pinning. To support a robust and easy-to-use FFI the memory manager
must support temporary pinning where individual objects can be prevented from being moved by the GC for as long as required, either by being one of an in-progress FFI call's arguments, or by having pinning asserted by a primitive (allowing objects to be passed to external code that retains a reference to the object after returning). Pinning probably implies a per-object "is-pinned" bit in the object header. Pinning will be done via lazy become; i.e. an object in new space will be becommed into a pinned object in old space. We will only support pinning in old space.
- efficient old space collection. An incremental collector (a la
Dijkstra's three colour algorithm) collects old space, e.g. via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. (see free space/free list below)
- 8-byte alignment. It is advantageous for the FFI, for floating-point
access, for object movement and for 32/64-bit compatibility to keep object sizes in units of 8 bytes. For the FFI, 8-byte alignment means passing objects to code that expects that requirement (such as modern x86 numeric processing instructions). This implies that
	- the starts of all spaces are aligned on 8-byte boundaries
	- object allocation rounds up the requested size to a multiple of 8 bytes
	- the overflow size field is also 8 bytes
We shall probably keep the minimum object size at 16 bytes so that there is always room for a forwarding pointer. But this implies that we will need an 8-byte filler to fill the holes left after objects > 16 bytes whose length mod 16 bytes is 8 bytes, and the holes following pinned objects. We can do this using a special class index, e.g. 1, so that the methods that answer the size and start of an object look like, e.g.
	chunkSizeOf: oop
		<var: #oop type: #'object *'>
		^object classIndex = 1
			ifTrue: [BaseHeaderSize]
			ifFalse: [BaseHeaderSize
					+ (object slotSize = OverflowSlotSize
						ifTrue: [OverflowSizeBytes]
						ifFalse: [0])
					+ (object slotSize * BytesPerSlot)]

	chunkStartOf: oop
		<var: #oop type: #'object *'>
		^(self cCoerceSimple: oop to: #'char *')
			- ((object classIndex = 1
				or: [object slotSize ~= OverflowSlotSize])
					ifTrue: [0]
					ifFalse: [OverflowSizeBytes])
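As a cross-check of the alignment rules, here is a small C helper (my naming, not the VM's) that rounds a requested byte size to the 8-byte unit with the 16-byte minimum that leaves room for a forwarding pointer:

```c
#include <stdint.h>

/* round a requested byte size up to the 8-byte allocation unit,
   keeping the 16-byte minimum object size */
static uint64_t allocationBytes(uint64_t nbytes)
{
    uint64_t rounded = (nbytes + 7) & ~7ULL;  /* 8-byte alignment */
    return rounded < 16 ? 16 : rounded;       /* room for a forwarding pointer */
}
```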
For the moment we do not tackle the issue of heap growth and shrinkage
with the ability to allocate and deallocate heap segments via memory-mapping. This technique allows space to be released back to the OS by unmapping empty segments. We may revisit this but it is not a key requirement for the first implementation.
The basic approach is to use a fixed size new space and a growable old
space. The new space is a classic three-space nursery a la Ungar's Generation Scavenging, a large eden for new objects and two smaller survivor spaces that exchange roles on each collection, one being the to space to which surviving objects are copied, the other being the from space of the survivors of the previous collection, i.e. the previous to space.
To provide apparent pinning in new space we rely on lazy become. Since
most pinned objects will be byte data and these do not require stack zone activation scanning, the overhead is simply an old space allocation and corpsing.
To provide pinning in old space, large objects are implicitly pinned (because it is expensive to move large objects and, because they are both large and relatively rare, they contribute little to overall fragmentation - as in aggregates, small objects can be used to fill in the spaces between large objects). Hence, objects above a particular size are automatically allocated in old space, rather than new space, and the pin bit will be set automatically when allocating a large object. Small objects are pinned as per objects in new space, by asserting the pin bit. As a last resort, or by programmer control (the fullGC primitive), old space is collected via mark-sweep (mark-compact) and so the mark phase must build the list of pinned objects around which the sweep/compact phase must carefully step.
Free space in old space is organized by a free list/free tree as in
Eliot's VisualWorks 5i old space allocator. There are 64 free lists, indices 1 through 63 holding blocks of space of that size, index 0 holding a semi-balanced ordered tree of free blocks, each node being the head of the list of free blocks of that size. At the start of the mark phase the free list is thrown away and the sweep phase coalesces free space and steps over pinned objects as it proceeds. We can reuse the forwarding pointer compaction scheme used in the old collector. Incremental collections merely move unmarked objects to the free lists (as well as nilling weak pointers in weak arrays and scheduling them for finalization). The occupancy of the free lists is represented by a bitmap in a 64-bit integer so that an allocation of size 63 or less can know whether there exists a free chunk of that size, but more importantly can know whether a free chunk larger than it exists in the fixed-size free lists without having to search all larger free list heads.
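The occupancy bitmap trick can be sketched in C. __builtin_ctzll is a GCC/Clang builtin (an assumption about the toolchain); the real VM would use whatever count-trailing-zeros idiom its compilers provide.

```c
#include <stdint.h>

/* bit i of mask set => the free list for chunks of size i (1..63) is non-empty */
static unsigned smallestFreeListAtLeast(uint64_t mask, unsigned n)  /* 1 <= n <= 63 */
{
    uint64_t candidates = mask & ~((1ULL << n) - 1);  /* drop lists smaller than n */
    if (candidates == 0)
        return 0;  /* nothing adequate: fall back to the tree at index 0 */
    return (unsigned)__builtin_ctzll(candidates);     /* smallest adequate list */
}
```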
The incremental collector (a la Dijkstra's three colour algorithm)
collects old space via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. [N.B. Not sure how to do this yet. The incremental collector needs to complete a pass often enough to reclaim objects, but infrequently enough not to waste time. So some form of feedback should work. In VisualWorks tracing is broken into quanta of work where image-level code determines the size of a quantum based on how fast the machine is, and how big the heap is. This code could easily live in the VM, controllable through vmParameterAt:put:. An alternative would be to use the heartbeat to bound quanta by time. But in any case some amount of incremental collection would be done on old space allocation and scavenging, the amount being chosen to keep pause times acceptably short, and at a rate to reclaim old space before a full GC is required, i.e. at a rate proportional to the growth in old space]. The incremental collector is a state machine, being either marking, nilling weak pointers, or freeing. If nilling weak pointers is not done atomically then there must be a read barrier in weak array at: so that reading from an old space weak array that is holding stale un-nilled references to unmarked objects answers nil. Tricks such as including the weak bit in bounds calculations can make this cheap for non-weak arrays. Alternatively nilling weak pointers can be made an atomic part of incremental collection, which can be made cheaper by maintaining the set of weak arrays (e.g. on a list).
The incremental collector implies a more complex write barrier.
Objects are of three colours: black, having been scanned; grey, being scanned; and white, unreached. A mark stack holds the grey objects. If the incremental collector is marking and an unmarked white object is stored into a black object then the stored object must become grey, being added to the mark stack. So the write barrier is essentially
	target isYoung ifFalse:
		[newValue isYoung
			ifTrue:
				[target isInRememberedSet ifFalse:
					[target addToRememberedSet
					 "target now refers to a young object; it is a root for scavenges"]]
			ifFalse:
				[(target isBlack and: [igc marking and: [newValue isWhite]]) ifTrue:
					[newValue beGrey
					 "add newValue to the IGC's markStack for subsequent scanning"]]]
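The same barrier in C, with a deliberately toy object model (the booleans, colour enum, and fixed-size mark stack are stand-ins for the real header bits and remembered set):

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { White, Grey, Black } Colour;

typedef struct object {
    bool isYoung, inRememberedSet;
    Colour colour;
    struct object *slots[4];
} object;

static bool igcMarking;           /* is the incremental collector marking? */
static object *markStack[256];    /* grossly simplified IGC mark stack */
static int markStackTop;

static void storePointer(object *target, size_t index, object *newValue)
{
    target->slots[index] = newValue;
    if (target->isYoung)
        return;                                  /* no barrier work for new space stores */
    if (newValue->isYoung) {
        if (!target->inRememberedSet)
            target->inRememberedSet = true;      /* target is now a root for scavenges */
    } else if (target->colour == Black && igcMarking && newValue->colour == White) {
        newValue->colour = Grey;                 /* re-queue for scanning */
        markStack[markStackTop++] = newValue;
    }
}
```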
The incremental collector does not detect already marked objects all of
whose references have been overwritten by other stores (e.g. in the above if newValue overwrites the sole remaining reference to a marked object). So the incremental collector only guarantees to collect all garbage created in cycle N at the end of cycle N + 1. The cost is hence slightly worse memory density but the benefit, provided the IGC works hard enough, is the elimination of long pauses due to full garbage collections, which become actions of last resort or programmer desire.
Lazy become.
As described earlier the basic idea behind lazy become is to use
corpses (forwarding objects) that are followed lazily during GC and inline cache miss. However, a lazy scheme cannot be used on objects with named inst vars without adding checking to all inst var accesses, which we judge too expensive. Instead, when becomming objects with named inst vars, we scan all activations in the stack zone, eagerly becomming these references, and we check for corpses when faulting in a context into the stack zone. Essentially, the invariant is that there are no references to corpses from the receiver slots of stack activations. A detail is whether we allow or forbid pinning of closure indirection vectors, or scan the entire stack of each activation. Using a special class index pun for indirection vectors is a cheap way of preventing their becomming/pinning etc. Although "don't do that" (don't attempt to pin/become indirection vectors) is also an acceptable response.
60-bit immediate Floats

Representation for immediate doubles, only used in the 64-bit implementation. Immediate doubles have the same 52-bit mantissa as IEEE double-precision floating-point, but only have 8 bits of exponent. So they occupy just less than the middle 1/8th of the double range. They overlap the normal single-precision floats, which also have 8-bit exponents, but exclude the single-precision denormals (exponent -127) and the single-precision NaNs (exponent +127). +/- zero is just a pair of values with both exponent and mantissa 0.
So the non-zero immediate doubles range from
	+/- 0x3800,0000,0000,0001 / 5.8774717541114d-39
to
	+/- 0x47ff,ffff,ffff,ffff / 6.8056473384188d+38
The encoded tagged form has the sign bit moved to the least significant bit, which allows for faster encode/decode because offsetting the exponent can't overflow into the sign bit and because testing for +/- 0 is an unsigned compare for <= 0xf:
	msb                                                          lsb
	[8 exponent subset bits][52 mantissa bits][1 sign bit][3 tag bits]
So assuming the tag is 5, the tagged non-zero bit patterns are
	0x0000,0000,0000,001[d/5] to 0xffff,ffff,ffff,fff[d/5]
and +/- 0d is
	0x0000,0000,0000,000[d/5]
Encode/decode of non-zero values in machine code looks like:

	msb                                                       lsb
	Decode:
	                         [8expsubset][52mantissa][1s][3tags]
	shift away tags:     [   000   ][8expsubset][52mantissa][1s]
	add exponent offset: [      11 exponent     ][52mantissa][1s]
	rot sign:            [1s][      11 exponent     ][52mantissa]

	Encode:
	                     [1s][      11 exponent     ][52mantissa]
	rot sign:            [      11 exponent     ][52mantissa][1s]
	sub exponent offset: [   000   ][8expsubset][52mantissa][1s]
	shift:               [8expsubset][52mantissa][1s][   000   ]
	or/add tags:             [8expsubset][52mantissa][1s][3tags]

but this is slower in C because a) there is no rotate, and b) raw conversion between double and quadword must (at least in the source) move bits through memory (quadword = *(q64 *)&doubleVariable).
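The encode/decode above, transliterated to C under the stated assumptions (tag = 5, exponent subset 0x380..0x47f): note the rotate has to be composed from shifts, and the double/quadword moves go through memcpy, which is exactly why the text calls the C version slower.

```c
#include <stdint.h>
#include <string.h>

#define TAG_BITS   3
#define FLOAT_TAG  5ULL                      /* assumed SmallFloat tag pattern */
#define EXP_OFFSET (0x380ULL << 53)          /* smallest immediate exponent, pre-shifted */

/* C has no rotate operator, so compose one from shifts */
static uint64_t rotLeft1 (uint64_t x) { return (x << 1) | (x >> 63); }
static uint64_t rotRight1(uint64_t x) { return (x >> 1) | (x << 63); }

/* does d fall in the representable middle 1/8th of the double range (or +/- 0)? */
static int isImmediateFloat(double d)
{
    uint64_t raw, exp, mant;
    memcpy(&raw, &d, sizeof raw);            /* bits move through memory */
    exp  = (raw >> 52) & 0x7ff;
    mant = raw & ((1ULL << 52) - 1);
    return (raw & ~(1ULL << 63)) == 0        /* +/- 0 */
        || (exp == 0x380 ? mant != 0         /* range starts at 0x3800...0001 */
                         : exp > 0x380 && exp <= 0x47f);
}

static uint64_t encodeImmediateFloat(double d)  /* caller checks isImmediateFloat */
{
    uint64_t raw, rot;
    memcpy(&raw, &d, sizeof raw);
    rot = rotLeft1(raw);                     /* [11 exponent][52 mantissa][1 sign] */
    if (rot > 1)                             /* 0 and 1 are +/- 0; keep their zero exponent */
        rot -= EXP_OFFSET;
    return (rot << TAG_BITS) | FLOAT_TAG;    /* [8expsubset][52mantissa][1s][3tags] */
}

static double decodeImmediateFloat(uint64_t tagged)
{
    uint64_t rot = tagged >> TAG_BITS;       /* shift away tags */
    double d;
    if (rot > 1)
        rot += EXP_OFFSET;                   /* re-bias the exponent */
    rot = rotRight1(rot);                    /* sign bit back to the msb */
    memcpy(&d, &rot, sizeof d);
    return d;
}
```

Per the text, +0.0 and -0.0 encode to 0x5 and 0xd respectively, so the zero test is an unsigned compare against 0xf.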
Issues: How do we avoid the Size4Bit for 64 bits? The format word encodes the number of odd bytes, but currently has only 4 bits and hence only supports odd bytes of 0 - 3. We need odd bytes of 0 - 7. But I don't like the separate Size4Bit. Best to change the VI code and have a 5-bit format? We lose one bit but save two bits (isEphemeron and isWeak, or three, if isPointers) for a net gain of one (or two).
Further, keep Squeak's format idea or go for separate bits? For 64 bits we need a 5-bit format field. This contrasts with isPointers, isWeak, isEphemeron, and 3 odd size bits (or a byte size); the format field is quite economical.
Are class indices in inline caches strong references to classes or weak
references?
If strong then they must be scanned during GC and the methodZone must
be flushed on fullGC to reclaim all classes (this looks to be a bug in the V3 Cogit).
If weak then when the class table loses references, PICs containing freed classes must be freed, and then sends to freed PICs, or to PICs containing freed classes, must be unlinked.
The second approach is faster; the common case is scanning the class
table, the uncommon case is freeing classes. The second approach is better; in-line caches do not prevent reclamation of classes.
Stef
-- best, Eliot
On Wed, May 30, 2012 at 10:22 PM, Eliot Miranda <eliot.miranda@gmail.com> wrote:

On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse <stephane.ducasse@inria.fr> wrote:
I would like to be sure that we can have
- a bit for immutable objects
- bits for experimenting.
There will be quite a few. And one will be able to steal bits from the class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
8: slot size (255 => extra header word with large size)
3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
4: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
1: immutability
3: GC (2 mark bits, 1 forwarded bit)
20: identity hash
and we can make it lazy, that is, we compute it not at instantiation time but rather the first time the primitive is called.
20: class index
This would probably work for a while. I think that it would be good to leave an "open door" so that in the future we can just add one more word for a class pointer.
still leaves 5 bits unused. And stealing 4 bits each from class index still leaves 64k classes. So this format is simple and provides lots of unused bits. The format field is a great idea as it combines a number of orthogonal properties in very few bits. I don't want to include odd bytes in format because I think a separate field that holds odd bytes and fixed fields is better use of space. But we can gather statistics before deciding.
Stef
On May 30, 2012, at 8:48 AM, Stéphane Ducasse wrote:
Hi guys
Here is an important topic I would like to see discussed so that we see
how we can improve and join forces.
May a mail discussion then a wiki for the summary would be good.
stef
Begin forwarded message:
From: Eliot Miranda eliot.miranda@gmail.com Subject: Re: Plan/discussion/communication around new object format Date: May 27, 2012 10:49:54 PM GMT+02:00 To: Stéphane Ducasse stephane.ducasse@inria.fr
On Sat, May 26, 2012 at 1:46 AM, Stéphane Ducasse <
stephane.ducasse@inria.fr> wrote:
Hi eliot
do you have a description of the new object format you want to
introduce?
The design is in the class comment of CogMemoryManager in the Cog
VMMaker packages.
Then what is your schedule?
This is difficult. I have made a small start and should be able to
spend time on it starting soon. I want to have it finished by early next year. But it depends on work schedules etc.
I would like to see if we can allocate igor/esteban time before we run
out of money
to help on that important topic. Now the solution is unclear and I did not see any document where we
can evaluate
and plan how we can help. So do you want help on that topic? Then how
can people
contribute if any?
The first thing to do is to read the design document, to see if the
Pharo community thinks it is the right direction, and to review it, spot deficiencies etc. So please get those interested to read the class comment of CogMemoryManager in the latest VMMaker.oscog.
Here's the current version of it:
CogMemoryManager is currently a place-holder for the design of the new
Cog VM's object representation and garbage collector. The goals for the GC are
- efficient object representation a la Eliot Miranda's VisualWorks
64-bit object representation that uses a 64-bit header, eliminating direct class references so that all objects refer to their classes indirectly. Instead the header contains a constant class index, in a field smaller than a full pointer, These class indices are used in inline and first-level method caches, hence they do not have to be updated on GC (although they do have to be traced to be able to GC classes). Classes are held in a sparse weak table. The class table needs only to be indexed by an instance's class index in class hierarchy search, in the class primitive, and in tracing live objects in the heap. The additional header space is allocated to a much expanded identity hash field, reducing hash efficiency problems in identity collections due to the extremely small (11 bit) hash field in the old Squeak GC. The identity hash field is also a key element of the class index scheme. A class's identity hash is its index into the class table, so to create an instance of a class one merely copies its identity hash into the class index field of the new instance. This implies that when classes gain their identity hash they are entered into the class table and their identity hash is that of a previously unused index in the table. It also implies that there is a maximum number of classes in the table. At least for a few years 64k classes should be enough. A class is entered into the class table in the following operations:
behaviorHash adoptInstance instantiate become (i.e. if an old class becomes a new class) if target class field's = to original's id hash and replacement's id hash is zero enter replacement in class table
behaviorHash is a special version of identityHash that must be
implemented in the image by any object that can function as a class (i.e. Behavior).
- more immediate classes. An immediate Character class would speed up
String accessing, especially for WideString, since no instatiation needs to be done on at:put: and no dereference need be done on at:. In a 32-bit system tag checking is complex since it is thought important to retain 31-bit SmallIntegers. Hence, as in current Squeak, the least significant bit set implies a SmallInteger, but Characters would likely have a tag pattern of xxx10. Hence masking with 11 results in two values for SmallInteger, xxx01 and xxx11. 30-bit characters are more than adequate for Unicode. In a 64-bit system we can use the full three bits and usefully implement an immediate Float. As in VisualWorks a functional representation takes three bits away from the exponent. Rotating to put the sign bit in the least significant non-tag bit makes expanding and contracting the 8-bit exponent to the 11-bit IEEE double exponent easy ad makes comparing negative and positive zero easier (an immediate Float is zero if its unsigned 64-bits are < 16). So the representation looks like
| 8 bit exponent | 52 bit mantissa | sign bit | 3 tag bits |
For details see "60-bit immediate Floats" below.
- efficient scavenging. The current Squeak GC uses a slow
pointer-reversal collector that writes every field in live objects three times in each collection, twice in the pointer-reversing heap traversal to mark live objects and once to update the pointer to its new location. A scavenger writes every field of live data twice in each collection, once as it does a block copy of the object when copying to to space, once as it traverses the live pointers in the to space objects. Of course the block copy is a relatively cheap write.
- lazy become. The JIT's use of inline cacheing provides a cheap way
of avoiding scanning the heap as part of a become (which is the simple approach to implementing become in a system with direct pointers). A becomeForward: on a (set of) non-zero-sized object(s) turns the object into a "corpse" or "forwarding object" whose first (non-header) word/slot is replaced by a pointer to the target of the becomeForward:. The corpse's class index is set to one that identifies corpses and, because it is a hidden class index, will always fail an inline cache test. The inline cache failure code is then responsible for following the forwarding pointer chain (these are Iliffe vectors :) ) and resolving to the actual target. We have yet to determine exactly how this is done (e.g. change the receiver register and/or stack contents and retry the send, perhaps scanning the current activation). See below on how we deal with becomes on objects with named inst vars. Note that we probably don't have to worry about zero-sized objects. These are unlikely to be passed through the FFI (there is nothing to pass :) ) and so will rarely be becommed. If they do, they can become slowly. Alternatively we can insist that objects are at least 16 bytes in size (see a8-byte alignment below) so that there will always be space for a forwarding pointer. Since none of the immediate classes can have non-immediate instances and since we allocate the immediate classes indices corresponding to their tag pattern (SmallInteger = 1, Character = 3, SmallFloat = 4?) we can use all the class indices from 0 to 7 for special uses, 0 = forward, and e.g. 1 = header-sized filler.
- pinning. To support a robust and easy-to-use FFI the memory manager
must support temporary pinning where individual objects can be prevented from being moved by the GC for as long as required, either by being one of an in-progress FFI call's arguments, or by having pinning asserted by a primitive (allowing objects to be passed to external code that retains a reference to the object after returning). Pinning probably implies a per-object "is-pinned" bit in the object header. Pinning will be done via lazy become; i..e an object in new space will be becommed into a pinned object in old space. We will only support pinning in old space.
- efficient old space collection. An incremental collector (a la
Dijkstra's three colour algorithm) collects old space, e.g. via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. (see free space/free list below)
- 8-byte alignment. It is advantageous for the FFI, for
floating-point access, for object movement and for 32/64-bit compatibility to keep object sizes in units of 8 bytes. For the FFI, 8-byte alignment means passing objects to code that expects that requirement (such as modern x86 numeric processing instructions). This implies that
- the starts of all spaces are aligned on 8-byte boundaries - object allocation rounds up the requested size to a multiple of
8 bytes
- the overflow size field is also 8 bytes
We shall probably keep the minimum object size at 16 bytes so that
there is always room for a forwarding pointer. But this implies that we will need to implement an 8-byte filler to fill holes between objects > 16 bytes whose length mod 16 bytes is 8 bytes and following pinned objects. We can do this using a special class index, e.g. 1, so that the method that answers the size of an object looks like, e.g.
chunkSizeOf: oop <var: #oop type: #'object *'> ^object classIndex = 1 ifTrue: [BaseHeaderSize] ifFalse: [BaseHeaderSize + (object slotSize = OverflowSlotSize ifTrue:
[OverflowSizeBytes]
ifFalse: [0]) + (object slotSize * BytesPerSlot)] chunkStartOf: oop <var: #oop type: #'object *'> ^(self cCoerceSimple: oop to: #'char *') - ((object classIndex = 1 or: [object slotSize ~= OverflowSlotSize]) ifTrue: [0] ifFalse: [OverflowSizeBytes])
For the moment we do not tackle the issue of heap growth and shrinkage
with the ability to allocate and deallocate heap segments via memory-mapping. This technique allows space to be released back to the OS by unmapping empty segments. We may revisit this but it is not a key requirement for the first implementation.
The basic approach is to use a fixed size new space and a growable old
space. The new space is a classic three-space nursery a la Ungar's Generation Scavenging, a large eden for new objects and two smaller survivor spaces that exchange roles on each collection, one being the to space to which surviving objects are copied, the other being the from space of the survivors of the previous collection, i.e. the previous to space.
To provide apparent pinning in new space we rely on lazy become.
Since most pinned objects will be byte data and these do not require stack zone activation scanning, the overhead is simply an old space allocation and corpsing.
To provide pinning in old space, large objects are implicitly pinned (because it is expensive to move large objects and, because they are both large and relatively rare, they contribute little to overall fragmentation - as in aggregates, small objects can be used to fill in the spaces between large objects). Hence, objects above a particular size are automatically allocated in old space, rather than new space, and the pin bit is set automatically when allocating a large object. Small objects are pinned by asserting the pin bit, as for objects in new space. As a last resort, or under programmer control (the fullGC primitive), old space is collected via mark-sweep (mark-compact), and so the mark phase must build the list of pinned objects around which the sweep/compact phase must carefully step.
Free space in old space is organized by a free list/free tree as in Eliot's VisualWorks 5i old space allocator. There are 64 free lists, indices 1 through 63 holding blocks of space of that size, index 0 holding a semi-balanced ordered tree of free blocks, each node being the head of the list of free blocks of that size. At the start of the mark phase the free list is thrown away and the sweep phase coalesces free space and steps over pinned objects as it proceeds. We can reuse the forwarding pointer compaction scheme used in the old collector. Incremental collections merely move unmarked objects to the free lists (as well as nilling weak pointers in weak arrays and scheduling them for finalization). The occupancy of the free lists is represented by a bitmap in a 64-bit integer so that an allocation of size 63 or less can know whether there exists a free chunk of that size, but more importantly can know whether a free chunk larger than it exists in the fixed-size free lists without having to search all larger free list heads.
The incremental collector (a la Dijkstra's three colour algorithm) collects old space via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. [N.B. Not sure how to do this yet. The incremental collector needs to complete a pass often enough to reclaim objects, but infrequently enough not to waste time. So some form of feedback should work. In VisualWorks tracing is broken into quanta of work where image-level code determines the size of a quantum based on how fast the machine is, and how big the heap is. This code could easily live in the VM, controllable through vmParameterAt:put:. An alternative would be to use the heartbeat to bound quanta by time. But in any case some amount of incremental collection would be done on old space allocation and scavenging, the amount being chosen to keep pause times acceptably short, and at a rate to reclaim old space before a full GC is required, i.e. at a rate proportional to the growth in old space]. The incremental collector is a state machine, being either marking, nilling weak pointers, or freeing. If nilling weak pointers is not done atomically then there must be a read barrier in weak array at: so that reading from an old space weak array that is holding stale un-nilled references to unmarked objects answers nil rather than the stale references. Tricks such as including the weak bit in bounds calculations can make this cheap for non-weak arrays. Alternatively nilling weak pointers can be made an atomic part of incremental collection, which can be made cheaper by maintaining the set of weak arrays (e.g. on a list).
The incremental collector implies a more complex write barrier. Objects are of three colours: black, having been scanned; grey, being scanned; and white, unreached. A mark stack holds the grey objects. If the incremental collector is marking and an unmarked white object is stored into a black object then the stored object must become grey, being added to the mark stack. So the write barrier is essentially
target isYoung
    ifFalse:
        [newValue isYoung
            ifTrue:
                [target isInRememberedSet ifFalse:
                    [target addToRememberedSet
                     "target now refers to a young object; it is a root for scavenges"]]
            ifFalse:
                [(target isBlack and: [igc marking and: [newValue isWhite]]) ifTrue:
                    [newValue beGrey
                     "add newValue to the IGC's markStack for subsequent scanning"]]]
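An executable rendering of the same barrier (illustrative names only; the VM's real barrier is inlined machine code):

```python
class Obj:
    def __init__(self, young=False, colour='white'):
        self.young = young      # generation: new space vs old space
        self.colour = colour    # tri-colour state: 'white', 'grey' or 'black'
        self.slots = []

remembered_set = set()  # old objects referring to young ones: scavenge roots
mark_stack = []         # grey objects awaiting scanning by the IGC
igc_marking = True      # whether the incremental collector is in its mark phase

def store_pointer(target, new_value):
    target.slots.append(new_value)
    if not target.young:
        if new_value.young:
            # target now refers to a young object; it is a root for scavenges
            remembered_set.add(target)
        elif target.colour == 'black' and igc_marking and new_value.colour == 'white':
            # restore the tri-colour invariant: no black-to-white references
            new_value.colour = 'grey'
            mark_stack.append(new_value)
```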
The incremental collector does not detect already marked objects all
of whose references have been overwritten by other stores (e.g. in the above if newValue overwrites the sole remaining reference to a marked object). So the incremental collector only guarantees to collect all garbage created in cycle N at the end of cycle N + 1. The cost is hence slightly worse memory density but the benefit, provided the IGC works hard enough, is the elimination of long pauses due to full garbage collections, which become actions of last resort or programmer desire.
Lazy become.
As described earlier the basic idea behind lazy become is to use
corpses (forwarding objects) that are followed lazily during GC and inline cache miss. However, a lazy scheme cannot be used on objects with named inst vars without adding checking to all inst var accesses, which we judge too expensive. Instead, when becoming objects with named inst vars, we scan all activations in the stack zone, eagerly becoming these references, and we check for corpses when faulting a context into the stack zone. Essentially, the invariant is that there are no references to corpses from the receiver slots of stack activations. A detail is whether we allow or forbid pinning of closure indirection vectors, or scan the entire stack of each activation. Using a special class index pun for indirection vectors is a cheap way of preventing their becoming/pinning etc., although "don't do that" (don't attempt to pin/become indirection vectors) is also an acceptable response.
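A sketch of corpse resolution as the inline-cache miss path must perform it; the distinguished class index for forwarders (0 here) is an assumption for illustration:

```python
FORWARDED_CLASS_INDEX = 0   # assumed hidden class index identifying corpses

class Header:
    def __init__(self, class_index, target=None):
        self.class_index = class_index
        self.target = target    # first slot: the forwarding pointer of a corpse

def follow(obj):
    """Follow a possibly chained forwarding pointer to the real object.
    A corpse always fails the inline cache test, landing us here."""
    while obj.class_index == FORWARDED_CLASS_INDEX:
        obj = obj.target
    return obj
```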
60-bit immediate Floats

Representation for immediate doubles, only used in the 64-bit implementation. Immediate doubles have the same 52-bit mantissa as IEEE double-precision floating-point, but only have 8 bits of exponent. So they occupy just less than the middle 1/8th of the double range. They overlap the normal single-precision floats, which also have 8-bit exponents, but exclude the single-precision denormals (exponent -127) and the single-precision NaNs (exponent +127). +/- zero is just a pair of values with both exponent and mantissa 0.
So the non-zero immediate doubles range from
    +/- 0x3800,0000,0000,0001 / 5.8774717541114d-39
to
    +/- 0x47ff,ffff,ffff,ffff / 6.8056473384188d+38
The encoded tagged form has the sign bit moved to the least significant bit, which allows for faster encode/decode because offsetting the exponent can't overflow into the sign bit and because testing for +/- 0 is an unsigned compare for <= 0xf:

    msb                                                          lsb
    [8 exponent subset bits][52 mantissa bits][1 sign bit][3 tag bits]

So assuming the tag is 5, the tagged non-zero bit patterns are
    0x0000,0000,0000,001[d/5] to 0xffff,ffff,ffff,fff[d/5]
and +/- 0d is
    0x0000,0000,0000,000[d/5]

Encode/decode of non-zero values in machine code looks like:

Decode:
                             [8expsubset][52mantissa][1s][3tags]
    shift away tags:         [ 000 ][8expsubset][52mantissa][1s]
    add exponent offset:     [ 11 exponent ][52mantissa][1s]
    rot sign:                [1s][ 11 exponent ][52mantissa]

Encode:
                             [1s][ 11 exponent ][52mantissa]
    rot sign:                [ 11 exponent ][52mantissa][1s]
    sub exponent offset:     [ 000 ][8expsubset][52mantissa][1s]
    shift:                   [8expsubset][52mantissa][1s][ 000 ]
    or/add tags:             [8expsubset][52mantissa][1s][3tags]

but this is slower in C because a) there is no rotate, and b) raw conversion between double and quadword must (at least in the source) move bits through memory (quadword = *(q64 *)&doubleVariable).
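The rotate-based transformation can be checked with a small Python model, a sketch under the stated assumptions (tag pattern 5, non-zero values whose exponent fits the 8-bit subset):

```python
import struct

TAG  = 5                           # assumed immediate-Float tag pattern
MASK = (1 << 64) - 1
EXP_OFFSET = (1023 - 127) << 53    # 11-bit bias minus 8-bit subset bias,
                                   # positioned above mantissa and sign

def encode(d):
    """Tag a non-zero double whose exponent fits the 8-bit subset."""
    bits = struct.unpack('<Q', struct.pack('<d', d))[0]
    q = ((bits << 1) | (bits >> 63)) & MASK   # rot sign to least significant bit
    q -= EXP_OFFSET                           # sub exponent offset
    return ((q << 3) & MASK) | TAG            # shift and or in the tags

def decode(t):
    q = t >> 3                                # shift away tags
    q += EXP_OFFSET                           # add exponent offset
    bits = ((q >> 1) | ((q & 1) << 63)) & MASK  # rot sign back to msb
    return struct.unpack('<d', struct.pack('<Q', bits))[0]
```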
Issues:

How do we avoid the Size4Bit for 64-bits? The format word encodes the number of odd bytes, but currently has only 4 bits and hence only supports odd bytes of 0 - 3. We need odd bytes of 0 - 7. But I don't like the separate Size4Bit. Best to change the VI code and have a 5-bit format? We lose one bit but save two bits (isEphemeron and isWeak, or three if isPointers) for a net gain of one (or two).
Further, keep Squeak's format idea or go for separate bits? For 64-bits we need a 5-bit format field. This contrasts with isPointers, isWeak, isEphemeron, and 3 odd size bits (or a byte size); the format field is quite economical.
Are class indices in inline caches strong references to classes or
weak references?
If strong then they must be scanned during GC and the methodZone must
be flushed on fullGC to reclaim all classes (this looks to be a bug in the V3 Cogit).
If weak, then when the class table loses references, PICs containing freed classes must be freed, and sends to freed PICs, or to PICs containing freed classes, must be unlinked.
The second approach is faster; the common case is scanning the class
table, the uncommon case is freeing classes. The second approach is better; in-line caches do not prevent reclamation of classes.
Stef
-- best, Eliot
On Wed, May 30, 2012 at 1:53 PM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
On Wed, May 30, 2012 at 10:22 PM, Eliot Miranda eliot.miranda@gmail.comwrote:
On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse < stephane.ducasse@inria.fr> wrote:
I would like to be sure that we can have - bit for immutable objects - bits for experimenting.
There will be quite a few. And one will be able to steal bits from the class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
8: slot size (255 => extra header word with large size) 3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class) 4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc) 1: immutability 3: GC 2 mark bits. 1 forwarded bit 20: identity hash
and we can make it lazy, that is, we compute it not at instantiation time but rather the first time the primitive is called.
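This lazy-hash idea could look roughly like this (pure illustration; field and variable names invented):

```python
UNHASHED = 0             # reserved value meaning "no identity hash assigned yet"
next_identity_hash = 1

def identity_hash(obj):
    """Assign the identity hash on first request, not at instantiation."""
    global next_identity_hash
    if obj.get('hash', UNHASHED) == UNHASHED:
        obj['hash'] = next_identity_hash
        next_identity_hash += 1
    return obj['hash']
```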
20: class index
This would probably work for a while. I think that it would be good to leave an "open door" so that in the future we can just add one more word for a class pointer.
Turns out that's not such a simple change. Class indices have two advantages. One is that they're more compact (2^20 classes is still a lot of classes). The other is that they're constant, which has two main benefits. First, in method caches and in-line caches the class field holds an index and hence doesn't need to be updated by the GC; the GC no longer has to visit send sites. Second, because they're constants, both checking for well-known classes and instantiating well-known classes can be done without going to the specialObjectsArray. One just uses the constant. Now undoing these optimizations to open a back-door is not trivial. So best accept the benefits and exploit them to a maximum.
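To make the benefit concrete, here is a toy rendering (invented names, not the VM's code) of an inline-cache test keyed on a constant class index, which the GC never needs to visit or remap:

```python
class InlineCache:
    """Toy inline cache at a send site, keyed on a constant class index."""
    def __init__(self, cached_class_index, target_method):
        self.cached_class_index = cached_class_index   # a constant: no GC remap
        self.target_method = target_method

    def dispatch(self, receiver_class_index, miss_handler):
        if receiver_class_index == self.cached_class_index:
            return self.target_method()
        # miss: e.g. a corpse's hidden index, or a relink after a class is freed
        return miss_handler()
```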
-- Mariano http://marianopeck.wordpress.com
Hi Eliot:
From my experience with the RoarVM, it seems to be a rather simple exercise to enable the VM to support a custom 'pre-header' for objects. That is, a constant offset in the memory that comes from the allocator, which is normally ignored by the GC.
That allows me to do all kinds of things. Of course, at the cost of a word per object, and at the cost of recompiling the VM. But that should be a reasonable price to pay for someone doing research on these kinds of things.
Sometimes a few bits are just not enough, and such a pre-header gives much, much more flexibility. For the people interested in that, I could dig out the details (I think I did that already once on this list).
Best regards Stefan
On 30 May 2012, at 22:22, Eliot Miranda wrote:
On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse stephane.ducasse@inria.fr wrote: I would like to be sure that we can have - bit for immutable objects - bits for experimenting.
There will be quite a few. And one will be able to steal bits from the class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
8: slot size (255 => extra header word with large size) 3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class) 4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc) 1: immutability 3: GC 2 mark bits. 1 forwarded bit 20: identity hash 20: class index
still leaves 5 bits unused. And stealing 4 bits each from class index still leaves 64k classes. So this format is simple and provides lots of unused bits. The format field is a great idea as it combines a number of orthogonal properties in very few bits. I don't want to include odd bytes in format because I think a separate field that holds odd bytes and fixed fields is better use of space. But we can gather statistics before deciding.
Stef
On May 30, 2012, at 8:48 AM, Stéphane Ducasse wrote:
Hi guys
Here is an important topic I would like to see discussed so that we see how we can improve and join forces. Maybe a mail discussion and then a wiki for the summary would be good.
stef
Begin forwarded message:
From: Eliot Miranda eliot.miranda@gmail.com
Subject: Re: Plan/discussion/communication around new object format
Date: May 27, 2012 10:49:54 PM GMT+02:00
To: Stéphane Ducasse stephane.ducasse@inria.fr
On Sat, May 26, 2012 at 1:46 AM, Stéphane Ducasse <stephane.ducasse@inria.fr> wrote:
Hi Eliot
do you have a description of the new object format you want to introduce?
The design is in the class comment of CogMemoryManager in the Cog VMMaker packages.
Then what is your schedule?
This is difficult. I have made a small start and should be able to spend time on it starting soon. I want to have it finished by early next year. But it depends on work schedules etc.
I would like to see if we can allocate Igor/Esteban time before we run out of money to help on that important topic. Right now the solution is unclear, and I did not see any document where we can evaluate and plan how we can help. So do you want help on that topic? And if so, how can people contribute?
The first thing to do is to read the design document, to see if the Pharo community thinks it is the right direction, and to review it, spot deficiencies etc. So please get those interested to read the class comment of CogMemoryManager in the latest VMMaker.oscog.
Here's the current version of it:
CogMemoryManager is currently a place-holder for the design of the new Cog VM's object representation and garbage collector. The goals for the GC are
- efficient object representation a la Eliot Miranda's VisualWorks 64-bit object representation, which uses a 64-bit header, eliminating direct class references so that all objects refer to their classes indirectly. Instead the header contains a constant class index, in a field smaller than a full pointer. These class indices are used in inline and first-level method caches, hence they do not have to be updated on GC (although they do have to be traced to be able to GC classes). Classes are held in a sparse weak table. The class table needs only to be indexed by an instance's class index in class hierarchy search, in the class primitive, and in tracing live objects in the heap. The additional header space is allocated to a much expanded identity hash field, reducing hash efficiency problems in identity collections due to the extremely small (11 bit) hash field in the old Squeak GC. The identity hash field is also a key element of the class index scheme. A class's identity hash is its index into the class table, so to create an instance of a class one merely copies its identity hash into the class index field of the new instance. This implies that when classes gain their identity hash they are entered into the class table, and their identity hash is that of a previously unused index in the table. It also implies that there is a maximum number of classes in the table; at least for a few years 64k classes should be enough. A class is entered into the class table in the following operations:
	behaviorHash
	adoptInstance
	instantiate
	become (i.e. if an old class becomes a new class):
		if the target's class field is equal to the original's identity hash and the replacement's identity hash is zero, enter the replacement in the class table
behaviorHash is a special version of identityHash that must be implemented in the image by any object that can function as a class (i.e. Behavior).
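The instantiation path can be modelled in a few lines. This is a toy C sketch of the scheme as described; names such as classTable, classOf and instantiate are mine, chosen to mirror the prose, not the VM's actual code:

```c
/* Toy model of the class-index scheme: a class's identity hash IS its
   index in the class table, so instantiation just copies that hash into
   the new instance's class-index field. All names are illustrative. */
#define CLASS_TABLE_SIZE (1 << 16)   /* "64k classes should be enough" */

typedef struct Obj { unsigned classIndex; unsigned identityHash; } Obj;

static Obj *classTable[CLASS_TABLE_SIZE];   /* weak in the real design */

static Obj *classOf(Obj *o) { return classTable[o->classIndex]; }

static void enterIntoClassTable(Obj *aClass)
{
    classTable[aClass->identityHash] = aClass;
}

static Obj *instantiate(Obj *aClass, Obj *newInstance)
{
    enterIntoClassTable(aClass);             /* idempotent once hashed */
    newInstance->classIndex = aClass->identityHash;
    return newInstance;
}
```

The point of the design is visible here: no pointer to the class is stored in the instance, so the GC never needs to update instance headers when classes move.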
- more immediate classes. An immediate Character class would speed up String accessing, especially for WideString, since no instantiation needs to be done on at:put: and no dereference need be done on at:. In a 32-bit system tag checking is complex since it is thought important to retain 31-bit SmallIntegers. Hence, as in current Squeak, the least significant bit set implies a SmallInteger, but Characters would likely have a tag pattern of xxx10. Hence masking with binary 11 yields two values that imply SmallInteger, xxx01 and xxx11. 30-bit characters are more than adequate for Unicode. In a 64-bit system we can use the full three bits and usefully implement an immediate Float. As in VisualWorks a functional representation takes three bits away from the exponent. Rotating to put the sign bit in the least significant non-tag bit makes expanding and contracting the 8-bit exponent to the 11-bit IEEE double exponent easy, and makes comparing negative and positive zero easier (an immediate Float is zero if its unsigned 64 bits are < 16). So the representation looks like

	| 8 bit exponent | 52 bit mantissa | sign bit | 3 tag bits |
For details see "60-bit immediate Floats" below.
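For the 32-bit case, the tag tests described above reduce to mask-and-compare operations. A hedged C sketch (the helper names are mine; the tag patterns are the ones given in the text):

```c
#include <stdint.h>

/* 32-bit immediate tag checks as described: any oop with the least
   significant bit set is a 31-bit SmallInteger (patterns xx01 and xx11),
   while Characters use the two-bit pattern xx10. */
typedef uint32_t oop32;

static int isSmallInteger(oop32 o) { return (o & 1u) != 0; }   /* xxx1 */
static int isCharacter(oop32 o)    { return (o & 3u) == 2; }   /* xx10 */
static int isImmediate(oop32 o)    { return (o & 3u) != 0; }

static int32_t  smallIntegerValue(oop32 o) { return (int32_t)o >> 1; }
static uint32_t characterValue(oop32 o)    { return o >> 2; }  /* 30-bit code point */

static oop32 smallIntegerOop(int32_t v) { return ((oop32)v << 1) | 1u; }
static oop32 characterOop(uint32_t cp)  { return (cp << 2) | 2u; }
```

This shows why keeping 31-bit SmallIntegers forces the asymmetric scheme: SmallInteger claims both odd two-bit patterns, leaving only xx10 for Characters.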
- efficient scavenging. The current Squeak GC uses a slow pointer-reversal collector that writes every field in live objects three times in each collection, twice in the pointer-reversing heap traversal to mark live objects and once to update the pointer to its new location. A scavenger writes every field of live data twice in each collection, once as it does a block copy of the object when copying to to space, once as it traverses the live pointers in the to space objects. Of course the block copy is a relatively cheap write.
- lazy become. The JIT's use of inline caching provides a cheap way of avoiding scanning the heap as part of a become (which is the simple approach to implementing become in a system with direct pointers). A becomeForward: on a (set of) non-zero-sized object(s) turns the object into a "corpse" or "forwarding object" whose first (non-header) word/slot is replaced by a pointer to the target of the becomeForward:. The corpse's class index is set to one that identifies corpses and, because it is a hidden class index, will always fail an inline cache test. The inline cache failure code is then responsible for following the forwarding pointer chain (these are Iliffe vectors :) ) and resolving to the actual target. We have yet to determine exactly how this is done (e.g. change the receiver register and/or stack contents and retry the send, perhaps scanning the current activation). See below on how we deal with becomes on objects with named inst vars. Note that we probably don't have to worry about zero-sized objects. These are unlikely to be passed through the FFI (there is nothing to pass :) ) and so will rarely be becommed. If they do, they can become slowly. Alternatively we can insist that objects are at least 16 bytes in size (see 8-byte alignment below) so that there will always be space for a forwarding pointer. Since none of the immediate classes can have non-immediate instances, and since we allocate the immediate classes' indices corresponding to their tag pattern (SmallInteger = 1, Character = 3, SmallFloat = 4?), we can use all the class indices from 0 to 7 for special uses, 0 = forward, and e.g. 1 = header-sized filler.
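Following a forwarding chain after an inline-cache miss might look like the sketch below. This is a minimal C model under the assumptions the text proposes: class index 0 marks a corpse, and the forwarding pointer sits in the first slot. The struct layout and names are illustrative, not the VM's:

```c
/* Sketch of corpse-following: becomeForward: turns an object into a
   corpse whose hidden class index (0) always fails inline cache tests;
   the miss handler then chases first-slot pointers until it reaches a
   non-corpse. */
enum { ForwardedClassIndex = 0 };

typedef struct FObj {
    unsigned classIndex;
    struct FObj *firstSlot;   /* forwarding pointer when this is a corpse */
} FObj;

static FObj *followForwarding(FObj *o)
{
    while (o->classIndex == ForwardedClassIndex)
        o = o->firstSlot;     /* chains can be several corpses long */
    return o;
}
```

The loop terminates because every chain ends at a live object with a real class index; the cost is paid only on cache misses, never on the heap scan a naive become would require.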
- pinning. To support a robust and easy-to-use FFI the memory manager must support temporary pinning, where individual objects can be prevented from being moved by the GC for as long as required, either by being one of an in-progress FFI call's arguments, or by having pinning asserted by a primitive (allowing objects to be passed to external code that retains a reference to the object after returning). Pinning probably implies a per-object "is-pinned" bit in the object header. Pinning will be done via lazy become; i.e. an object in new space will be becommed into a pinned object in old space. We will only support pinning in old space.
- efficient old space collection. An incremental collector (a la Dijkstra's three colour algorithm) collects old space, e.g. via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. (see free space/free list below)
- 8-byte alignment. It is advantageous for the FFI, for floating-point access, for object movement and for 32/64-bit compatibility to keep object sizes in units of 8 bytes. For the FFI, 8-byte alignment means objects can be passed directly to code that expects that alignment (such as modern x86 numeric processing instructions). This implies that
- the starts of all spaces are aligned on 8-byte boundaries
- object allocation rounds up the requested size to a multiple of 8 bytes
- the overflow size field is also 8 bytes
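The rounding rule above is a couple of instructions. A small C sketch (the function name is mine; the 16-byte minimum anticipates the forwarding-pointer requirement discussed next):

```c
#include <stddef.h>

/* Round a requested byte size up to a multiple of 8, with a 16-byte
   minimum so a forwarding pointer always fits in any object. */
static size_t alignedAllocSize(size_t nBytes)
{
    size_t rounded = (nBytes + 7) & ~(size_t)7;
    return rounded < 16 ? 16 : rounded;
}
```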
We shall probably keep the minimum object size at 16 bytes so that there is always room for a forwarding pointer. But this implies that we will need to implement an 8-byte filler to fill holes between objects > 16 bytes whose length mod 16 bytes is 8 bytes, and following pinned objects. We can do this using a special class index, e.g. 1, so that the method that answers the size of an object looks like, e.g.

	chunkSizeOf: oop
		<var: #oop type: #'object *'>
		^object classIndex = 1
			ifTrue: [BaseHeaderSize]
			ifFalse: [BaseHeaderSize
					+ (object slotSize = OverflowSlotSize
						ifTrue: [OverflowSizeBytes]
						ifFalse: [0])
					+ (object slotSize * BytesPerSlot)]

	chunkStartOf: oop
		<var: #oop type: #'object *'>
		^(self cCoerceSimple: oop to: #'char *')
			- ((object classIndex = 1
					or: [object slotSize ~= OverflowSlotSize])
				ifTrue: [0]
				ifFalse: [OverflowSizeBytes])
For the moment we do not tackle the issue of heap growth and shrinkage with the ability to allocate and deallocate heap segments via memory-mapping. This technique allows space to be released back to the OS by unmapping empty segments. We may revisit this but it is not a key requirement for the first implementation.
The basic approach is to use a fixed size new space and a growable old space. The new space is a classic three-space nursery a la Ungar's Generation Scavenging, a large eden for new objects and two smaller survivor spaces that exchange roles on each collection, one being the to space to which surviving objects are copied, the other being the from space of the survivors of the previous collection, i.e. the previous to space.
To provide apparent pinning in new space we rely on lazy become. Since most pinned objects will be byte data and these do not require stack zone activation scanning, the overhead is simply an old space allocation and corpsing.
To provide pinning in old space, large objects are implicitly pinned (because it is expensive to move large objects and, because they are both large and relatively rare, they contribute little to overall fragmentation - as in aggregates, small objects can be used to fill in the spaces between large objects). Hence, objects above a particular size are automatically allocated in old space, rather than new space. Small objects are pinned as per objects in new space, by asserting the pin bit, which will be set automatically when allocating a large object. As a last resort, or by programmer control (the fullGC primitive), old space is collected via mark-sweep (mark-compact), and so the mark phase must build the list of pinned objects around which the sweep/compact phase must carefully step.
Free space in old space is organized by a free list/free tree as in Eliot's VisualWorks 5i old space allocator. There are 64 free lists, indices 1 through 63 holding blocks of space of that size, index 0 holding a semi-balanced ordered tree of free blocks, each node being the head of the list of free blocks of that size. At the start of the mark phase the free list is thrown away and the sweep phase coalesces free space and steps over pinned objects as it proceeds. We can reuse the forwarding pointer compaction scheme used in the old collector. Incremental collections merely move unmarked objects to the free lists (as well as nilling weak pointers in weak arrays and scheduling them for finalization). The occupancy of the free lists is represented by a bitmap in a 64-bit integer, so that an allocation of size 63 or less can know whether there exists a free chunk of that size, but more importantly can know whether a larger free chunk exists in the fixed size free lists without having to search all larger free list heads.
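The occupancy-bitmap trick can be sketched as follows; hypothetical C, using a compiler count-trailing-zeros intrinsic (GCC/Clang's __builtin_ctzll) for the scan:

```c
#include <stdint.h>

/* Bit i of the bitmap is set when free list i is non-empty. To satisfy a
   request of n slots (1 <= n <= 63), shift the bitmap down by n and count
   trailing zeros: that finds the smallest non-empty list >= n without
   walking every larger list head. Returns -1 when no fixed-size list can
   help and the allocator must fall back to the tree at index 0. */
static int smallestFreeListAtLeast(uint64_t occupancyBitmap, unsigned n)
{
    uint64_t candidates = occupancyBitmap >> n;
    if (candidates == 0)
        return -1;
    return (int)(n + (unsigned)__builtin_ctzll(candidates));
}
```

One shift and one bit-scan instruction replace a loop over up to 63 list heads, which is what makes the 64-bit bitmap worth its single word of space.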
The incremental collector (a la Dijkstra's three colour algorithm) collects old space via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. [N.B. Not sure how to do this yet. The incremental collector needs to complete a pass often enough to reclaim objects, but infrequently enough not to waste time. So some form of feedback should work. In VisualWorks tracing is broken into quanta of work where image-level code determines the size of a quantum based on how fast the machine is and how big the heap is. This code could easily live in the VM, controllable through vmParameterAt:put:. An alternative would be to use the heartbeat to bound quanta by time. But in any case some amount of incremental collection would be done on old space allocation and scavenging, the amount being chosen to keep pause times acceptably short, and at a rate to reclaim old space before a full GC is required, i.e. at a rate proportional to the growth in old space.] The incremental collector is a state machine, being either marking, nilling weak pointers, or freeing. If nilling weak pointers is not done atomically then there must be a read barrier in weak array at: so that reads from an old space weak array holding stale un-nilled references to unmarked objects do not answer those stale references. Tricks such as including the weak bit in bounds calculations can make this cheap for non-weak arrays. Alternatively nilling weak pointers can be made an atomic part of incremental collection, which can be made cheaper by maintaining the set of weak arrays (e.g. on a list).
The incremental collector implies a more complex write barrier. Objects are of three colours: black, having been scanned; grey, being scanned; and white, unreached. A mark stack holds the grey objects. If the incremental collector is marking and an unmarked white object is stored into a black object then the stored object must become grey, being added to the mark stack. So the write barrier is essentially

	target isYoung ifFalse:
		[newValue isYoung
			ifTrue: [target isInRememberedSet ifFalse:
						[target addToRememberedSet]]	"target now refers to a young object; it is a root for scavenges"
			ifFalse: [(target isBlack and: [igc marking and: [newValue isWhite]]) ifTrue:
						[newValue beGrey]]]	"add newValue to the IGC's markStack for subsequent scanning"
The incremental collector does not detect already marked objects all of whose references have been overwritten by other stores (e.g. in the above if newValue overwrites the sole remaining reference to a marked object). So the incremental collector only guarantees to collect all garbage created in cycle N at the end of cycle N + 1. The cost is hence slightly worse memory density but the benefit, provided the IGC works hard enough, is the elimination of long pauses due to full garbage collections, which become actions of last resort or programmer desire.
Lazy become.
As described earlier the basic idea behind lazy become is to use corpses (forwarding objects) that are followed lazily during GC and inline cache miss. However, a lazy scheme cannot be used on objects with named inst vars without adding checking to all inst var accesses, which we judge too expensive. Instead, when becomming objects with named inst vars, we scan all activations in the stack zone, eagerly becomming these references, and we check for corpses when faulting in a context into the stack zone. Essentially, the invariant is that there are no references to corpses from the receiver slots of stack activations. A detail is whether we allow or forbid pinning of closure indirection vectors, or scan the entire stack of each activation. Using a special class index pun for indirection vectors is a cheap way of preventing their becomming/pinning etc. Although "don't do that" (don't attempt to pin/become indirection vectors) is also an acceptable response.
60-bit immediate Floats

Representation for immediate doubles, only used in the 64-bit implementation. Immediate doubles have the same 52 bit mantissa as IEEE double-precision floating-point, but only have 8 bits of exponent. So they occupy just less than the middle 1/8th of the double range. They overlap the normal single-precision floats, which also have 8 bit exponents, but exclude the single-precision denormals (exponent -127) and the single-precision NaNs (exponent +127). +/- zero is just a pair of values with both exponent and mantissa 0.

So the non-zero immediate doubles range from
	+/- 0x3800,0000,0000,0001 / 5.8774717541114d-39
to
	+/- 0x47ff,ffff,ffff,ffff / 6.8056473384188d+38

The encoded tagged form has the sign bit moved to the least significant bit, which allows for faster encode/decode because offsetting the exponent can't overflow into the sign bit, and because testing for +/- 0 is an unsigned compare for <= 0xf:

	msb                                                          lsb
	[8 exponent subset bits][52 mantissa bits][1 sign bit][3 tag bits]

So, assuming the tag is 5, the tagged non-zero bit patterns are
	0x0000,0000,0000,001[d/5] to 0xffff,ffff,ffff,fff[d/5]
and +/- 0d is
	0x0000,0000,0000,000[d/5]

Encode/decode of non-zero values in machine code looks like:

	Decode:              [8expsubset][52mantissa][1s][3tags]
	shift away tags:     [   000    ][8expsubset][52mantissa][1s]
	add exponent offset: [      11 exponent     ][52mantissa][1s]
	rot sign:            [1s][      11 exponent     ][52mantissa]
	Encode:              [1s][      11 exponent     ][52mantissa]
	rot sign:            [      11 exponent     ][52mantissa][1s]
	sub exponent offset: [   000    ][8expsubset][52mantissa][1s]
	shift:               [8expsubset][52mantissa][1s][   000    ]
	or/add tags:         [8expsubset][52mantissa][1s][3tags]

but this is slower in C because a) there is no rotate, and b) raw conversion between double and quadword must (at least in the source) move bits through memory (quadword = *(q64 *)&doubleVariable).
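The encode/decode dance above round-trips in plain C as sketched below. This is my hedged reconstruction, not VM source: the offset 896 = 1023 - 127 is inferred from the stated range (IEEE exponent field 0x380 through 0x47f), the tag is assumed to be 5 as in the text, the rotates are spelled with shifts since C has none, and the sketch is valid only for +/-0 and doubles whose exponent fits the 8-bit subset:

```c
#include <stdint.h>
#include <string.h>

#define SMALL_FLOAT_TAG 5u
#define EXPONENT_OFFSET ((uint64_t)896 << 53)   /* 896 = 1023 - 127 */

static uint64_t smallFloatEncode(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);           /* bits "through memory", as noted */
    u = (u << 1) | (u >> 63);           /* rotate sign down to bit 0 */
    if (u > 1)                          /* +/-0 takes no exponent offset */
        u -= EXPONENT_OFFSET;           /* 11-bit -> 8-bit exponent subset */
    return (u << 3) | SMALL_FLOAT_TAG;  /* shift in, then add, the tags */
}

static double smallFloatDecode(uint64_t taggedOop)
{
    uint64_t u = taggedOop >> 3;        /* shift away tags */
    if (u > 1)
        u += EXPONENT_OFFSET;           /* 8-bit -> 11-bit exponent */
    u = (u >> 1) | (u << 63);           /* rotate sign back up to bit 63 */
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}
```

Note the zero test matches the text: a tagged immediate Float is +/-0 exactly when its unsigned 64 bits are <= 0xf (here 0x5 for +0d and 0xd for -0d).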
Issues: How do we avoid the Size4Bit for 64 bits? The format word encodes the number of odd bytes, but currently has only 4 bits and hence only supports odd bytes of 0 - 3. We need odd bytes of 0 - 7. But I don't like the separate Size4Bit. Best to change the VI code and have a 5 bit format? We lose one bit but save two bits (isEphemeron and isWeak (or three, if isPointers)) for a net gain of one (or two).
Further, should we keep Squeak's format idea or go for separate bits? For 64 bits we need a 5 bit format field. This contrasts with isPointers, isWeak, isEphemeron, and 3 odd size bits (or a byte size); the format field is quite economical.
Are class indices in inline caches strong references to classes or weak references? If strong, then they must be scanned during GC and the methodZone must be flushed on fullGC to reclaim all classes (this looks to be a bug in the V3 Cogit). If weak, then when the class table loses references, PICs containing freed classes must be freed, and then sends to freed PICs or to caches containing freed classes must be unlinked. The second approach is faster; the common case is scanning the class table, the uncommon case is freeing classes. The second approach is better; in-line caches do not prevent reclamation of classes.
Stef
-- best, Eliot
On Wed, May 30, 2012 at 1:57 PM, Stefan Marr <smalltalk@stefan-marr.de> wrote:
Hi Eliot:
From my experience with the RoarVM, it seems to be a rather simple exercise to enable the VM to support a custom 'pre-header' for objects. That is, a constant offset in the memory that comes from the allocator, and is normally ignored by the GC.
That's a great idea. So for experimental use one simply throws a whole word at the problem and forgets about the issue. Thanks, Stefan. That leaves me free to focus on something fast and compact that is still flexible. Great!
That allows me to do all kind of stuff. Of course at a cost of a word per object, and at the cost of recompiling the VM. But, that should be a reasonable price to pay for someone doing research on these kind of things.
Sometimes a few bits are just not enough, and such a pre-header gives much much more flexibility. For the people interested in that, I could dig out the details (I think, I did that already once on this list).
Best regards Stefan
On 30 May 2012, at 22:22, Eliot Miranda wrote:
On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse <
stephane.ducasse@inria.fr> wrote:
I would like to be sure that we can have - bit for immutable objects - bits for experimenting.
There will be quite a few. And one will be able to steal bits from the
class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
8: slot size (255 => extra header word with large size) 3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for
pointer, 7 => # fixed fields is in the class)
4 bits: format (pointers, indexable, bytes/shorts/longs/doubles
indexable, compiled method, ephemerons, weak, etc)
1: immutability 3: GC 2 mark bits. 1 forwarded bit 20: identity hash 20: class index
still leaves 5 bits unused. And stealing 4 bits each from class index
still leaves 64k classes. So this format is simple and provides lots of unused bits. The format field is a great idea as it combines a number of orthogonal properties in very few bits. I don't want to include odd bytes in format because I think a separate field that holds odd bytes and fixed fields is better use of space. But we can gather statistics before deciding.
Stef
On May 30, 2012, at 8:48 AM, Stéphane Ducasse wrote:
Hi guys
Here is an important topic I would like to see discussed so that we
see how we can improve and join forces.
May a mail discussion then a wiki for the summary would be good.
stef
Begin forwarded message:
From: Eliot Miranda eliot.miranda@gmail.com Subject: Re: Plan/discussion/communication around new object format Date: May 27, 2012 10:49:54 PM GMT+02:00 To: Stéphane Ducasse stephane.ducasse@inria.fr
On Sat, May 26, 2012 at 1:46 AM, Stéphane Ducasse <
stephane.ducasse@inria.fr> wrote:
Hi eliot
do you have a description of the new object format you want to
introduce?
The design is in the class comment of CogMemoryManager in the Cog
VMMaker packages.
Then what is your schedule?
This is difficult. I have made a small start and should be able to
spend time on it starting soon. I want to have it finished by early next year. But it depends on work schedules etc.
I would like to see if we can allocate igor/esteban time before we
run out of money
to help on that important topic. Now the solution is unclear and I did not see any document where we
can evaluate
and plan how we can help. So do you want help on that topic? Then how
can people
contribute if any?
The first thing to do is to read the design document, to see if the
Pharo community thinks it is the right direction, and to review it, spot deficiencies etc. So please get those interested to read the class comment of CogMemoryManager in the latest VMMaker.oscog.
Here's the current version of it:
CogMemoryManager is currently a place-holder for the design of the
new Cog VM's object representation and garbage collector. The goals for the GC are
- efficient object representation a la Eliot Miranda's VisualWorks
64-bit object representation that uses a 64-bit header, eliminating direct class references so that all objects refer to their classes indirectly. Instead the header contains a constant class index, in a field smaller than a full pointer, These class indices are used in inline and first-level method caches, hence they do not have to be updated on GC (although they do have to be traced to be able to GC classes). Classes are held in a sparse weak table. The class table needs only to be indexed by an instance's class index in class hierarchy search, in the class primitive, and in tracing live objects in the heap. The additional header space is allocated to a much expanded identity hash field, reducing hash efficiency problems in identity collections due to the extremely small (11 bit) hash field in the old Squeak GC. The identity hash field is also a key element of the class index scheme. A class's identity hash is its index into the class table, so to create an instance of a class one merely copies its identity hash into the class index field of the new instance. This implies that when classes gain their identity hash they are entered into the class table and their identity hash is that of a previously unused index in the table. It also implies that there is a maximum number of classes in the table. At least for a few years 64k classes should be enough. A class is entered into the class table in the following operations:
behaviorHash adoptInstance instantiate become (i.e. if an old class becomes a new class) if target class field's = to original's id hash and replacement's id hash is zero enter replacement in class table
behaviorHash is a special version of identityHash that must be
implemented in the image by any object that can function as a class (i.e. Behavior).
- more immediate classes. An immediate Character class would speed
up String accessing, especially for WideString, since no instatiation needs to be done on at:put: and no dereference need be done on at:. In a 32-bit system tag checking is complex since it is thought important to retain 31-bit SmallIntegers. Hence, as in current Squeak, the least significant bit set implies a SmallInteger, but Characters would likely have a tag pattern of xxx10. Hence masking with 11 results in two values for SmallInteger, xxx01 and xxx11. 30-bit characters are more than adequate for Unicode. In a 64-bit system we can use the full three bits and usefully implement an immediate Float. As in VisualWorks a functional representation takes three bits away from the exponent. Rotating to put the sign bit in the least significant non-tag bit makes expanding and contracting the 8-bit exponent to the 11-bit IEEE double exponent easy ad makes comparing negative and positive zero easier (an immediate Float is zero if its unsigned 64-bits are < 16). So the representation looks like
| 8 bit exponent | 52 bit mantissa | sign bit | 3 tag bits |
For details see "60-bit immediate Floats" below.
- efficient scavenging. The current Squeak GC uses a slow
pointer-reversal collector that writes every field in live objects three times in each collection, twice in the pointer-reversing heap traversal to mark live objects and once to update the pointer to its new location. A scavenger writes every field of live data twice in each collection, once as it does a block copy of the object when copying to to space, once as it traverses the live pointers in the to space objects. Of course the block copy is a relatively cheap write.
- lazy become. The JIT's use of inline cacheing provides a cheap way
of avoiding scanning the heap as part of a become (which is the simple approach to implementing become in a system with direct pointers). A becomeForward: on a (set of) non-zero-sized object(s) turns the object into a "corpse" or "forwarding object" whose first (non-header) word/slot is replaced by a pointer to the target of the becomeForward:. The corpse's class index is set to one that identifies corpses and, because it is a hidden class index, will always fail an inline cache test. The inline cache failure code is then responsible for following the forwarding pointer chain (these are Iliffe vectors :) ) and resolving to the actual target. We have yet to determine exactly how this is done (e.g. change the receiver register and/or stack contents and retry the send, perhaps scanning the current activation). See below on how we deal with becomes on objects with named inst vars. Note that we probably don't have to worry about zero-sized objects. These are unlikely to be passed through the FFI (there is nothing to pass :) ) and so will rarely be becommed. If they do, they can become slowly. Alternatively we can insist that objects are at least 16 bytes in size (see a8-byte alignment below) so that there will always be space for a forwarding pointer. Since none of the immediate classes can have non-immediate instances and since we allocate the immediate classes indices corresponding to their tag pattern (SmallInteger = 1, Character = 3, SmallFloat = 4?) we can use all the class indices from 0 to 7 for special uses, 0 = forward, and e.g. 1 = header-sized filler.
- pinning. To support a robust and easy-to-use FFI the memory
manager must support temporary pinning where individual objects can be prevented from being moved by the GC for as long as required, either by being one of an in-progress FFI call's arguments, or by having pinning asserted by a primitive (allowing objects to be passed to external code that retains a reference to the object after returning). Pinning probably implies a per-object "is-pinned" bit in the object header. Pinning will be done via lazy become; i..e an object in new space will be becommed into a pinned object in old space. We will only support pinning in old space.
- efficient old space collection. An incremental collector (a la
Dijkstra's three colour algorithm) collects old space, e.g. via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. (see free space/free list below)
- 8-byte alignment. It is advantageous for the FFI, for
floating-point access, for object movement and for 32/64-bit compatibility to keep object sizes in units of 8 bytes. For the FFI, 8-byte alignment means passing objects to code that expects that requirement (such as modern x86 numeric processing instructions). This implies that
- the starts of all spaces are aligned on 8-byte boundaries - object allocation rounds up the requested size to a multiple
of 8 bytes
- the overflow size field is also 8 bytes
We shall probably keep the minimum object size at 16 bytes so that
there is always room for a forwarding pointer. But this implies that we will need to implement an 8-byte filler to fill holes between objects > 16 bytes whose length mod 16 bytes is 8 bytes and following pinned objects. We can do this using a special class index, e.g. 1, so that the method that answers the size of an object looks like, e.g.
chunkSizeOf: oop <var: #oop type: #'object *'> ^object classIndex = 1 ifTrue: [BaseHeaderSize] ifFalse: [BaseHeaderSize + (object slotSize = OverflowSlotSize ifTrue:
[OverflowSizeBytes]
ifFalse: [0]) + (object slotSize * BytesPerSlot)] chunkStartOf: oop <var: #oop type: #'object *'> ^(self cCoerceSimple: oop to: #'char *') - ((object classIndex = 1 or: [object slotSize ~= OverflowSlotSize]) ifTrue: [0] ifFalse: [OverflowSizeBytes])
For the moment we do not tackle the issue of heap growth and
shrinkage with the ability to allocate and deallocate heap segments via memory-mapping. This technique allows space to be released back to the OS by unmapping empty segments. We may revisit this but it is not a key requirement for the first implementation.
The basic approach is to use a fixed size new space and a growable
old space. The new space is a classic three-space nursery a la Ungar's Generation Scavenging, a large eden for new objects and two smaller survivor spaces that exchange roles on each collection, one being the to space to which surviving objects are copied, the other being the from space of the survivors of the previous collection, i.e. the previous to space.
To provide apparent pinning in new space we rely on lazy become.
Since most pinned objects will be byte data and these do not require stack zone activation scanning, the overhead is simply an old space allocation and corpsing.
To provide pinning in old space, large objects are implicitly pinned
(because it is expensive to move large objects and, because they are both large and relatively rare, they contribute little to overall fragmentation
- as in aggregates, small objects can be used to fill-in the spaces between
karge objects). Hence, objects above a particular size are automatically allocated in old space, rather than new space. Small objects are pinned as per objects in new space, by asserting the pin bit, which will be set automaticaly when allocating a large object. As a last resort, or by programmer control (the fullGC primitive) old space is collected via mark-sweep (mark-compact) and so the mark phase must build the list of pinned objects around which the sweep/compact phase must carefully step.
Free space in old space is organized by a free list/free tree as in
Eliot's VisualWorks 5i old space allocator. There are 64 free lists, indices 1 through 63 holding blocks of space of that size, index 0 holding a semi-balanced ordered tree of free blocks, each node being the head of the list of free blocks of that size. At the start of the mark phase the free list is thrown away and the sweep phase coallesces free space and steps over pinned objects as it proceeds. We can reuse the forwarding pointer compaction scheme used in the old collector. Incremental collections merely move unmarked objects to the free lists (as well as nilling weak pointers in weak arrays and scheduling them for finalization). The occupancy of the free lists is represented by a bitmap in a 64-bit integer so that an allocation of size 63 or less can know whether there exists a free chunk of that size, but more importantly can know whether a free chunk larger than it exists in the fixed size free lists without having to search all larger free list heads.
The incremental collector (a la Dijkstra's three colour algorithm)
collects old space via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. [N.B. Not sure how to do this yet. The incremental collector needs to complete a pass often enough to reclaim objects, but infrequently enough not to waste time. So some form of feedback should work. In VisualWorks tracing is broken into quanta of work where image-level code determines the size of a quantum based on how fast the machine is, and how big the heap is. This code could easily live in the VM, controllable through vmParameterAt:put:. An alternative would be to use the heartbeat to bound quanta by time. But in any case some amount of incremental collection would be done on old space allocation and scavenging, the amount being chosen to keep pause times acceptably short, and at a rate to reclaim old space before a full GC is required, i.e. at a rate proportional to the growth in old space]. The incremental collector is a state machine, being either marking, nilling weak pointers, or freeing. If nilling weak pointers is not done atomically then there must be a read barrier in weak array at: so that reading from an old space weak array does not answer stale un-nilled references to unmarked objects. Tricks such as including the weak bit in bounds calculations can make this cheap for non-weak arrays. Alternatively nilling weak pointers can be made an atomic part of incremental collection, which can be made cheaper by maintaining the set of weak arrays (e.g. on a list).
The incremental collector implies a more complex write barrier.
Objects are of three colours: black, having been scanned; grey, being scanned; and white, unreached. A mark stack holds the grey objects. If the incremental collector is marking and an unmarked white object is stored into a black object then the stored object must become grey, being added to the mark stack. So the write barrier is essentially
	target isYoung ifFalse:
		[newValue isYoung
			ifTrue:
				[target isInRememberedSet ifFalse:
					[target addToRememberedSet
					 "target now refers to a young object; it is a root for scavenges"]]
			ifFalse:
				[(target isBlack and: [igc marking and: [newValue isWhite]]) ifTrue:
					[newValue beGrey
					 "add newValue to IGC's markStack for subsequent scanning"]]]
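The same barrier as a self-contained C sketch, with all structures, fields and names invented for illustration:

```c
/* Toy model of the combined remembered-set / tri-colour write barrier above.
   All structures, fields and names are invented for illustration. */

enum Colour { WHITE, GREY, BLACK };

typedef struct {
    int isYoung;            /* lives in new space */
    int inRememberedSet;
    enum Colour colour;
} Obj;

static int igcMarking = 1;  /* the incremental collector is in its mark phase */

/* Barrier applied on every store of newValue into a pointer slot of target. */
void storeCheck(Obj *target, Obj *newValue)
{
    if (target->isYoung)
        return;                 /* stores into young objects need no barrier */
    if (newValue->isYoung) {
        /* target now refers to a young object; it is a root for scavenges */
        if (!target->inRememberedSet)
            target->inRememberedSet = 1;
    } else if (target->colour == BLACK && igcMarking && newValue->colour == WHITE) {
        /* add newValue to the IGC's mark stack for subsequent scanning */
        newValue->colour = GREY;
    }
}
```

Note how the generational check and the tri-colour check share the initial `isYoung` tests, which is what keeps the combined barrier cheap on the common path.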
The incremental collector does not detect already marked objects all
of whose references have been overwritten by other stores (e.g. in the above if newValue overwrites the sole remaining reference to a marked object). So the incremental collector only guarantees to collect all garbage created in cycle N at the end of cycle N + 1. The cost is hence slightly worse memory density but the benefit, provided the IGC works hard enough, is the elimination of long pauses due to full garbage collections, which become actions of last resort or programmer desire.
Lazy become.
As described earlier the basic idea behind lazy become is to use
corpses (forwarding objects) that are followed lazily during GC and inline cache miss. However, a lazy scheme cannot be used on objects with named inst vars without adding checking to all inst var accesses, which we judge too expensive. Instead, when becomming objects with named inst vars, we scan all activations in the stack zone, eagerly becomming these references, and we check for corpses when faulting in a context into the stack zone. Essentially, the invariant is that there are no references to corpses from the receiver slots of stack activations. A detail is whether we allow or forbid pinning of closure indirection vectors, or scan the entire stack of each activation. Using a special class index pun for indirection vectors is a cheap way of preventing their becomming/pinning etc. Although "don't do that" (don't attempt to pin/become indirection vectors) is also an acceptable response.
60-bit immediate Floats. Representation for immediate doubles, only used in the 64-bit
implementation. Immediate doubles have the same 52 bit mantissa as IEEE double-precision floating-point, but only have 8 bits of exponent. So they occupy just less than the middle 1/8th of the double range. They overlap the normal single-precision floats, which also have 8 bit exponents, but exclude the single-precision denormals (exponent -127) and the single-precision NaNs (exponent +127). +/- zero is just a pair of values with both exponent and mantissa 0.
So the non-zero immediate doubles range from +/- 0x3800,0000,0000,0001 / 5.8774717541114d-39 to +/- 0x47ff,ffff,ffff,ffff / 6.8056473384188d+38. The encoded tagged form has the sign bit moved to the least significant bit, which allows for faster encode/decode because offsetting the exponent can't overflow into the sign bit and because testing for +/- 0 is an unsigned compare for <= 0xf:

	msb                                                          lsb
	[8 exponent subset bits][52 mantissa bits][1 sign bit][3 tag bits]
So assuming the tag is 5, the tagged non-zero bit patterns are 0x0000,0000,0000,001[d/5] to 0xffff,ffff,ffff,fff[d/5] and +/- 0d is 0x0000,0000,0000,000[d/5]. Encode/decode of non-zero values in machine code looks like:

	msb                                                  lsb
	Decode:
	                          [8expsubset][52mantissa][1s][3tags]
	shift away tags:     [ 000 ][8expsubset][52mantissa][1s]
	add exponent offset: [   11 exponent   ][52mantissa][1s]
	rot sign:            [1s][   11 exponent   ][52mantissa]

	Encode:
	                     [1s][   11 exponent   ][52mantissa]
	rot sign:            [   11 exponent   ][52mantissa][1s]
	sub exponent offset: [ 000 ][8expsubset][52mantissa][1s]
	shift:               [8expsubset][52mantissa][1s][ 000 ]
	or/add tags:              [8expsubset][52mantissa][1s][3tags]

but it is slower in C because a) there is no rotate, and b) raw conversion between double and quadword must (at least in the source) move bits through memory (quadword = *(q64 *)&doubleVariable).
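A hedged C sketch of the encode/decode above; the tag value (5), tag width (3) and exponent offset follow the text, while the function and type names, and the `> 1` zero test, are assumptions:

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the immediate-double encode/decode described above. The tag
   value (5), tag width (3) and exponent offset are taken from the text;
   the names are invented. +/-0 (whose untagged form is just the sign bit,
   i.e. 0 or 1) skip the exponent offset. */

#define TAG_BITS   3
#define FLOAT_TAG  5ULL
#define EXP_OFFSET (0x380ULL << 53)  /* raises the 8-bit exponent subset to 11 bits */

typedef uint64_t oop;

static uint64_t rotLeft1(uint64_t x)  { return (x << 1) | (x >> 63); }
static uint64_t rotRight1(uint64_t x) { return (x >> 1) | (x << 63); }

/* Encode a double whose exponent lies in the immediate range (or +/-0). */
oop encodeSmallFloat(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);  /* the move-through-memory the text mentions */
    bits = rotLeft1(bits);           /* [11 exp][52 mantissa][1 sign] -> sign to lsb */
    if (bits > 1)                    /* not +/- 0 */
        bits -= EXP_OFFSET;          /* [000][8 exp subset][52 mantissa][1 sign] */
    return (bits << TAG_BITS) | FLOAT_TAG;
}

double decodeSmallFloat(oop o)
{
    uint64_t bits = o >> TAG_BITS;   /* shift away the tags */
    if (bits > 1)                    /* not +/- 0 */
        bits += EXP_OFFSET;          /* restore the full 11-bit exponent */
    bits = rotRight1(bits);          /* sign back to the msb */
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

The rotate-by-one is a single instruction on most machines; in C the compiler has to recognize the shift-or idiom, and the `memcpy` is the portable way to move raw bits between double and quadword without violating aliasing rules.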
Issues: How do we avoid the Size4Bit for 64-bits? The format word encodes
the number of odd bytes, but currently has only 4 bits and hence only supports odd bytes of 0 - 3. We need odd bytes of 0 - 7. But I don't like the separate Size4Bit. Best to change the VI code and have a 5 bit format? We lose one bit but save two bits (isEphemeron and isWeak, or three, if isPointers) for a net gain of one (or two).
Further, keep Squeak's format idea or go for separate bits? For 64-bits we need a 5 bit format field. This contrasts with isPointers, isWeak, isEphemeron, and 3 odd size bits (or a byte size); the format field is quite economical.
Are class indices in inline caches strong references to classes or
weak references?
If strong then they must be scanned during GC and the methodZone must
be flushed on fullGC to reclaim all classes (this looks to be a bug in the V3 Cogit).
If weak, then when the class table loses references, PICs containing freed classes must be freed, and then sends to freed PICs or to methods containing freed classes must be unlinked.
The second approach is faster; the common case is scanning the class
table, the uncommon case is freeing classes. The second approach is better; in-line caches do not prevent reclamation of classes.
Stef
-- best, Eliot
-- Stefan Marr Software Languages Lab Vrije Universiteit Brussel Pleinlaan 2 / B-1050 Brussels / Belgium http://soft.vub.ac.be/~smarr Phone: +32 2 629 2974 Fax: +32 2 629 3525
Here are couple (2) of mine, highly valuable cents :)
2^20 for classes? Might be fine (or even overkill) for Smalltalk systems, but could be quite limiting for one who would like to experiment with implementing prototype-based frameworks, where every object is a "class" by itself.
---
8: slot size (255 => extra header word with large size)
3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
1: immutability
3: GC (2 mark bits, 1 forwarded bit)
20: identity hash
20: class index
---
What takes most of the space in the object header? Right - hash! Now, since we will have lazy become, I am back to my idea of having extra & arbitrary properties per object.
In a nutshell, the idea is to not store the hash in the object header, but instead use just a single 'hash present' bit.
When the identity hash of an object is requested (via the corresponding primitive), the implementation checks whether 'hash present' is set; if it is not, we do a 'lazy become' of the existing object to a copy of the same object in another place, but with the hash bit set and with an extra 64-bit field where the hash value can be stored.
So, when you request an identity hash for an object which doesn't have one, the object goes from [header][...data..] to a copy in a new memory region with the new layout [header][hash bits][...data..],
and the old object is, of course, 'corpsed' into a forwarding pointer to the new location.
The next step is going from holding just the hash to having an arbitrary & dynamic number of extra fields per object. In the same way, we use 1 extra bit, indicating that the object has extra properties. And when an object doesn't have it, we lazy-become it from [header][...data..] or [header][hash bits][...data..] to: [header][hash bits][oop][...data..]
where 'oop' can be anything - an instance of Array/Dictionary (depending on how the language side decides to store the extra properties of an object).
This, for instance, would allow us to store extra properties for special object formats like variable bytes or compiled methods, which don't have instance variables.
No need to mention how useful it is to be able to attach extra properties per object without changing the object's class. And, of course, the freed 18 bits (20 - 2) in the header can be allocated for other purposes. (Stef, how many bits do you need for experiments? ;)
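A minimal C sketch of this hash-on-demand idea. Note the simplification: the toy reserves the hash field up front rather than growing the object on copy, and all names and layouts are invented:

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy model of "hash on demand" via lazy become. The real scheme would add
   a 64-bit hash field when first asked; this toy reserves the field up front
   and only models the forwarding (corpse) mechanics. */

enum { HASH_PRESENT = 1u << 0, FORWARDED = 1u << 1 };

typedef struct Obj {
    uint32_t flags;
    struct Obj *forward;   /* valid when FORWARDED is set */
    uint64_t hash;         /* valid when HASH_PRESENT is set */
    uint64_t data;
} Obj;

static uint64_t nextHash = 1;

/* Follow forwarding pointers until we reach a live object. */
Obj *follow(Obj *o)
{
    while (o->flags & FORWARDED)
        o = o->forward;
    return o;
}

/* Answer the identity hash, lazily becoming the object into a copy that
   carries the hash; the reference we came through is updated in place. */
uint64_t identityHash(Obj **slot)
{
    Obj *o = follow(*slot);
    if (!(o->flags & HASH_PRESENT)) {
        Obj *copy = malloc(sizeof *copy);   /* copy with room for the hash */
        copy->flags = HASH_PRESENT;
        copy->forward = NULL;
        copy->hash = nextHash++;
        copy->data = o->data;
        o->flags |= FORWARDED;              /* corpse the old object */
        o->forward = copy;
        o = copy;
    }
    *slot = o;
    return o->hash;
}
```

Two references that reach the same object through different paths answer the same hash, because both end up following the corpse to the single hashed copy.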
------------
About immediates zoo.
Keep in mind, that the more immediates we have, the more complex implementation tends to be.
I would just keep 2 data types:
- integers
- floats
and a third, special 'arbitrary' immediate, which is seen by the VM as a 60-bit value. The interpretation of this value depends on a lookup in a range table, where the developer specifies the correspondence between a value interval and a class: [min .. max] -> class.
The intervals, of course, cannot overlap. Determining the class of such an immediate might be slower - O(log2(n)) at best (where n is the size of the range table), but on the other hand, how many different kinds of immediates can you fit into a 60-bit value? Right, it is 2^60. Much more than the proposed 8, isn't it? :)
And this extra cost can be mitigated completely by the inline cache:
- in the case of a regular reference, you must fetch the object's class and then compare it with the one stored in the cache;
- in the case of an immediate reference, you compare the immediate value with the min and max stored in the cache fields, and if the value is in range, you got a cache hit and are free to proceed. So, it's just one extra comparison compared to the 'classic' inline cache.
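The slow-path range-table lookup might be sketched like this in C; the table contents, bounds and class indices are purely hypothetical:

```c
#include <stdint.h>

/* Sketch of the proposed range table mapping 60-bit immediate values to
   classes: a sorted array of non-overlapping [min, max] -> classIndex
   entries, searched by binary search. The table contents and class indices
   here are purely hypothetical. */

typedef struct {
    uint64_t min, max;     /* inclusive bounds of the immediate range */
    int classIndex;        /* language-side class for values in the range */
} RangeEntry;

static const RangeEntry rangeTable[] = {
    { 0x0000000000000000ULL, 0x07FFFFFFFFFFFFFFULL, 1 },  /* e.g. SmallInteger */
    { 0x0800000000000000ULL, 0x08000000001FFFFFULL, 2 },  /* e.g. Character */
    { 0x0900000000000000ULL, 0x0FFFFFFFFFFFFFFFULL, 3 },  /* e.g. SmallFloat */
};
enum { rangeTableSize = sizeof rangeTable / sizeof *rangeTable };

/* O(log2 n) class lookup for an (untagged) 60-bit immediate value;
   answers -1 if the value falls in no declared range. */
int classOfImmediate(uint64_t value)
{
    int lo = 0, hi = rangeTableSize - 1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (value < rangeTable[mid].min)      hi = mid - 1;
        else if (value > rangeTable[mid].max) lo = mid + 1;
        else return rangeTable[mid].classIndex;
    }
    return -1;
}
```

An inline cache would store the hit entry's min/max, so the binary search only runs on cache misses.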
And, after thinking about how the inline cache is organized, you can now scratch my first paragraph related to immediates! We really don't need to discriminate between small integers/floats/the rest!! They could also be nothing more than just one of the ranges defined in our zoo of 'special' immediates!
So, at the end we will have just two kinds of references:
- zero bit == 0 -- an object pointer
- zero bit == 1 -- an immediate
Voila!.
We can have a real zoo of immediates, and a simple implementation to support them. Not to mention that the range table is provided by the language side, so we're free to rearrange it at any moment.
And of course, it doesn't mean that the VM could not reserve some of the ranges for its own 'contracted' immediates, like Characters, and even class references for example. Think about it :)
On Wed, 30 May 2012, Igor Stasenko wrote:
Here are couple (2) of mine, highly valuable cents :)
2^20 for classes? Might be fine (or even overkill) for Smalltalk systems, but could be quite limiting for one who would like to experiment with implementing prototype-based frameworks, where every object is a "class" by itself.
I think it's more important to have a fast Smalltalk VM, than one that is slower, but might fit for a concrete experiment which might happen sometime and would get some performance boost from the implementation.
8: slot size (255 => extra header word with large size)
3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
1: immutability
3: GC (2 mark bits, 1 forwarded bit)
20: identity hash
20: class index
What takes most of the space in the object header? Right - hash! Now, since we will have lazy become, I am back to my idea of having extra & arbitrary properties per object.
In a nutshell, the idea is to not store the hash in the object header, but instead use just a single 'hash present' bit.
When the identity hash of an object is requested (via the corresponding primitive), the implementation checks whether 'hash present' is set; if it is not, we do a 'lazy become' of the existing object to a copy of the same object in another place, but with the hash bit set and with an extra 64-bit field where the hash value can be stored.
So, when you request an identity hash for an object which doesn't have one, the object goes from [header][...data..] to a copy in a new memory region with the new layout [header][hash bits][...data..],
and the old object is, of course, 'corpsed' into a forwarding pointer to the new location.
The weak point of this idea is that you might run out of memory during the allocation of the new object if you ask for the identity hash of a larger object, or of many smaller objects at once.
The next step is going from holding just the hash to having an arbitrary & dynamic number of extra fields per object. In the same way, we use 1 extra bit, indicating that the object has extra properties. And when an object doesn't have it, we lazy-become it from [header][...data..] or [header][hash bits][...data..] to: [header][hash bits][oop][...data..]
where 'oop' can be anything - an instance of Array/Dictionary (depending on how the language side decides to store the extra properties of an object).
This, for instance, would allow us to store extra properties for special object formats like variable bytes or compiled methods, which don't have instance variables.
No need to mention how useful it is to be able to attach extra properties per object without changing the object's class. And, of course, the freed 18 bits (20 - 2) in the header can be allocated for other purposes. (Stef, how many bits do you need for experiments? ;)
About immediates zoo.
Keep in mind, that the more immediates we have, the more complex implementation tends to be.
I would just keep 2 data types:
- integers
- floats
and third, special 'arbitrary' immediate , which seen by VM as a 60-bit value. The interpretation of this value depends on lookup in range-table, where developer specifying the correspondence between the value interval and class: [min .. max] -> class
intervals, of course, cannot overlap. Determining a class of such immediate might be slower - O(log2(n)) at best (where n is size of range table), but from other side, how many different kinds of immediates you can fit into 60-bit value? Right, it is 2^60. Much more than proposed 8 isn't? :)
And this extra cost can be mitigated completely by inline cache.
- in case of regular reference, you must fetch the object's class and
then compare it with one, stored in cache.
- in case of immediate reference, you compare immediate value with min
and max stored in cache fields. And if value is in range, you got a cache hit, and free to proceed. So, its just 1 extra comparison comparing to 'classic' inline cache.
And, after thinking how inline cache is organized, now you can scratch the first my paragraph related to immediates! We really don't need to discriminate between small integers/floats/rest!! They could also be nothing more than just one of a range(s) defined in our zoo of 'special' immediates!
So, at the end we will have just two kinds of references:
- zero bit == 0 -- an object pointer
- zero bit == 1 -- an immediate
Voila!.
We can have real zoo of immediates, and simple implementation to support them. And not saying that range-table is provided by language-side, so we're free to rearrange them at any moment.
And of course, it doesn't means that VM could not reserve some of the ranges for own 'contracted' immediates, like Characters, and even class reference for example. Think about it :)
I like the idea, but I'm not sure how useful it will be in practice. I'd also add characters as a third data type. String/Character operations should be as fast as possible.
Levente
Some extra ideas.
1. Avoiding the extra header for big-sized objects. I'm not sure about this, but still...
according to Eliot's design: 8: slot size (255 => extra header word with large size)
What if we extend the size field to 16 bits (so in total it will be 65536 slots) and have a single flag indicating how to calculate the object size:

	flag(0): object size = (size field) * 8
	flag(1): object size = 2 ^ (size field)

which means that past 2^16 (or however many bits we dedicate to the size field in the header) all object sizes will be powers of two. Since most objects will fit under 2^16, we don't lose much. For big arrays, we could have a special collection/array which stores the exact size in its inst var (and we don't even need to care in the cases of Sets/Dicts/OrderedCollections). Also, we can actually make it transparent:
	Array class>>new: size
		size > (max exact size) ifTrue:
			[^ArrayWithBigSizeWhatever new: size]

Of course, care must be taken for those variable classes which can potentially hold large amounts of bytes (like Bitmap). But I think code can be quickly adapted to this feature of the VM, which will simply fail a #new: primitive if the size is not a power of two for sizes greater than the max "exact size" which can fit into the size field of the header. ----
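A small C sketch of this two-mode size encoding, with invented names; the #new: failure for large non-power-of-two sizes is modelled by returning 0:

```c
#include <stdint.h>

/* Sketch of the proposed two-mode size encoding: with the flag clear, the
   16-bit size field holds the slot count directly (byte size = slots * 8);
   with the flag set, it holds log2 of the slot count, so large objects are
   rounded to powers of two. Names are invented; __builtin_ctzll is a
   GCC/Clang builtin. */

typedef struct {
    unsigned sizeField : 16;
    unsigned powerOfTwoFlag : 1;
} SizeHeader;

uint64_t slotCount(SizeHeader h)
{
    return h.powerOfTwoFlag ? (1ULL << h.sizeField) : h.sizeField;
}

/* Encode a requested slot count; answers 0 (primitive failure) for large
   sizes that are not powers of two, as the text suggests #new: should. */
int encodeSlotCount(uint64_t n, SizeHeader *h)
{
    if (n < (1ULL << 16)) {          /* exact sizes fit the field directly */
        h->powerOfTwoFlag = 0;
        h->sizeField = (unsigned)n;
        return 1;
    }
    if ((n & (n - 1)) != 0)          /* not a power of two */
        return 0;
    h->powerOfTwoFlag = 1;
    h->sizeField = (unsigned)__builtin_ctzll(n);
    return 1;
}
```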
2. Slot for arbitrary properties. If you read carefully, Eliot said that to make lazy become work it is necessary to always have some extra space per object, even if the object doesn't have any fields:
<<We shall probably keep the minimum object size at 16 bytes so that there is always room for a forwarding pointer.>>
So, this fits quite well with the idea of having a slot for dynamic properties per object. What if, instead of "extending" an object when it requires an extra properties slot, we just reserve the slot for properties at the very beginning:

	[header][properties slot] ...rest of data..

so any object will have that slot. And in the case of lazy become, we can use that slot for holding the forwarding pointer. Voila.
3. From 2 we go straight back to hash. The VM doesn't need to know such a thing as an object's hash; it has no semantic load inside the VM, which just answers those bits via a single primitive.
So why is it a kind of enforced inherent property of all objects in the system? And if we have that one, why could we not have more than one, or as many as we want? This is my central question around the idea of having per-object properties. Once the VM guarantees that any object can have at least one slot for storing an object reference (the property slot), then it is no longer necessary for the VM to care about identity hash.
It can be implemented completely at the language side. But best of all, we are NO longer limited in how big or small hash values are, which directly converts into bonuses: fewer hash collisions -> more performance. Want a 64-bit hash? 128-bit? Whatever you desire:
	Object>>identityHash
		^self propertiesAt: #hash ifAbsentPut: [HashGenerator newHashValue]
And once we have per-object properties and lazy become, things like Magma will get a HUGE benefit straight out of the box. Because look, lazy become and immutability - those two address many problems related to OODB implementation (I barely see other use cases where immutability would be as useful as in the case of an OODB).. so for me it is logical to have this last step: by adding arbitrary properties, an OODB can now store its ID there.
On 2012-06-11, at 1:36 AM, Igor Stasenko wrote:
Because look, lazy become, immutability - those two addressing many problems related to OODB implementation (i barely see other use cases, where immutability would be as useful as in cases of OODB).. so for me it is logical to have this last step: by adding arbitrary properties, OODB now can store the ID there.
Well, it goes a little further than that. I think immutability is generally useful for any system that persists objects outside the image. OODBs are one example, but the same applies for ORM, LOOM-style virtual memory, or even syncing of state across the network. I've even wished for immutability when working on web applications. It's a join point for any aspects related to state.
Arbitrary properties are actually used quite a bit already, they just don't have VM support. Morphic and Tweak use them extensively, as does the dependency system. I suspect we'd find that a lot of hacks and kludges could be subsumed by VM-supported arbitrary properties. (e.g., ephemerons).
So yeah, +1 to arbitrary properties.
Colin
If you remember, I said before that I don't like immutability enforced through a VM contract.
Imagine two frameworks using the immutability flag for their own purposes and contending for ownership of the same object(s)..
IMO there are other, better solutions to that, but I'm not going to go into details...
On 11 June 2012 18:05, Colin Putney colin@wiresong.com wrote:
On 2012-06-11, at 1:36 AM, Igor Stasenko wrote:
Because look, lazy become, immutability - those two addressing many problems related to OODB implementation (i barely see other use cases, where immutability would be as useful as in cases of OODB).. so for me it is logical to have this last step: by adding arbitrary properties, OODB now can store the ID there.
Well, it goes a little further than that. I think immutability is generally useful for any system that persists objects outside the image. OODBs are one example, but the same applies for ORM, LOOM-style virtual memory, or even syncing of state across the network. I've even wished for immutability when working on web applications. It's a join point for any aspects related to state.
Arbitrary properties are actually used quite a bit already, they just don't have VM support. Morphic and Tweak use them extensively, as does the dependency system. I suspect we'd find that a lot of hacks and kludges could be subsumed by VM-supported arbitrary properties. (e.g., ephemerons).
So yeah, +1 to arbitrary properties.
Colin
if you remember i said before, that i don't like immutability enforced through VM contract.
imagine two frameworks, using immutability flag for their own purposes, and contending for ownership of same object(s)..
IMO, there are other , better, solutions to that but i'm not going to go in details...
I totally agree with you here, too. An immutability-bit is like static-typing, or RDBMS RI, or "truly private" methods (to which I'm also opposed). Each of those enforces some past notion about that object/method/whatever on the present, even until it should no longer be the case -- at which point you simply go into the "meta settings" and turn it off because it got in your way.
Probably better to just avoid all that in the first place.
On Mon, Jun 11, 2012 at 2:42 PM, Chris Muller asqueaker@gmail.com wrote:
if you remember i said before, that i don't like immutability enforced through VM contract.
imagine two frameworks, using immutability flag for their own purposes, and contending for ownership of same object(s)..
IMO, there are other , better, solutions to that but i'm not going to go in details...
I totally agree with you here, too. An immutability-bit is like static-typing, or RDBMS RI, or "truly private" methods (to which I'm also opposed). Each of those enforces some past notion about that object/method/whatever on the present, even until it should no longer be the case -- at which point you simply go into the "meta settings" and turn it off because it got in your way.
Probably better to just avoid all that in the first place.
Ok, what if we call it a "fast write-barrier provided by the VM". Would that change your view? Igor?
Colin
On 11 June 2012 23:46, Colin Putney colin@wiresong.com wrote:
On Mon, Jun 11, 2012 at 2:42 PM, Chris Muller asqueaker@gmail.com wrote:
if you remember i said before, that i don't like immutability enforced through VM contract.
imagine two frameworks, using immutability flag for their own purposes, and contending for ownership of same object(s)..
IMO, there are other , better, solutions to that but i'm not going to go in details...
I totally agree with you here, too. An immutability-bit is like static-typing, or RDBMS RI, or "truly private" methods (to which I'm also opposed). Each of those enforces some past notion about that object/method/whatever on the present, even until it should no longer be the case -- at which point you simply go into the "meta settings" and turn it off because it got in your way.
Probably better to just avoid all that in the first place.
Ok, what if we call it a "fast write-barrier provided by the VM". Would that change your view? Igor?
The problem is that it is not nearly as fast: at every memory write you have to check this flag; even if it is not used anywhere by anything at the language side, you will still pay the price.
I would understand if we wanted a dedicated VM where this flag serves the single purpose of supporting specific software (like an OODB) which is its only user.
Another (open) question is how to deal with immutability in the presence of become, i.e.:

	mutableCopy := immutableObject shallowCopy.
	immutableObject becomeForward: mutableCopy.

This kind of thing makes immutability useless for protecting critical parts of object memory, like preventing modification of compiled method literals: yes, you cannot change the immutable object, but you can replace all pointers to it with its own mutable copy, which is equivalent to making it mutable again.
On 2012-06-13, at 05:27, Igor Stasenko wrote:
Another (open) question, is how to deal with immutability in presence of become, i.e.:
mutableCopy := immutableObject shallowCopy. immutableObject becomeForward: mutableCopy.
such kind of things making immutability useless for protecting some critical parts of object memory, like preventing modification of compiled method literals: yes you cannot change the immutable object, but you can replace all pointers to it with it's own mutable copy, which is equivalent to making it mutable again.
Why should the VM allow become of an immutable object?
- Bert -
On 13 June 2012 09:41, Bert Freudenberg bert@freudenbergs.de wrote:
On 2012-06-13, at 05:27, Igor Stasenko wrote:
Another (open) question, is how to deal with immutability in presence of become, i.e.:
mutableCopy := immutableObject shallowCopy. immutableObject becomeForward: mutableCopy.
such kind of things making immutability useless for protecting some critical parts of object memory, like preventing modification of compiled method literals: yes you cannot change the immutable object, but you can replace all pointers to it with it's own mutable copy, which is equivalent to making it mutable again.
Why should the VM allow become of an immutable object?
You can disallow this. But you can only make it harder: I can do it manually - take all objects pointing to the immutable one and replace the pointer to it with its mutable copy. And it is completely legal, except that it will be a bit harder, since it is not done by a primitive.
Disallowing #become on immutables raises many additional questions:
What is your action when you need to migrate instances of a class due to its reshaping, while some of them are immutable? (I bet there are many other examples where this will break existing traditional schemes, like working with proxies etc.)
I don't wanna spread FUD.. just want to make sure that we are ready to answer every such question.
- Bert -
On 13 June 2012 09:41, Bert Freudenberg bert@freudenbergs.de wrote:
On 2012-06-13, at 05:27, Igor Stasenko wrote:
Another (open) question, is how to deal with immutability in presence of become, i.e.:
mutableCopy := immutableObject shallowCopy. immutableObject becomeForward: mutableCopy.
such kind of things making immutability useless for protecting some critical parts of object memory, like preventing modification of compiled method literals: yes you cannot change the immutable object, but you can replace all pointers to it with it's own mutable copy, which is equivalent to making it mutable again.
Why should the VM allow become of an immutable object?
You can disallow this. But you can only make it harder: i can do it manually - take all objects pointing to immutable one and replace the pointer to it with it's mutable copy. And it is completely legal, except that it will be a bit harder, since done not by primitive.
You are confusing something here. Become is just a "fancier form" of assignment that you could in fact write without a primitive by enumerating all the references to an object. If you keep that in mind it is obvious that since assigning an immutable *to* a variable is never a problem using become *with* an immutable as argument is neither since in both cases the immutable remains immutable.
The real question that arises is whether become should be allowed to change the *contents* of an immutable object. Personally, I think it should not, but this has some side effects that require fixing such as class migration which really should have a separate primitive to update its instances after reshape - the rules for this *should* include changing immutables unless you want to change the system to deal with multiple concurrent class versions (much pain down that path).
Disallowing #become on immutables raising many additional questions:
what is your action when you need to migrate instances of a class due to it's reshaping, while some of them are immutable?
Class migration should really have its own primitive. If it had, much pain could be avoided in migrating classes (see comments in ClassBuilder>>update:to:). And then one could decide on the proper policy to use for immutables.
Cheers, - Andreas
(I bet there is be many other examples, when this will break existing traditional schemes, like working with proxies etc).
I don't wanna spread FUD.. just want to make sure that we ready to answer for every such question.
- Bert -
-- Best regards, Igor Stasenko.
On 13 June 2012 15:16, Andreas Raab Andreas.Raab@gmx.de wrote:
On 13 June 2012 09:41, Bert Freudenberg bert@freudenbergs.de wrote:
On 2012-06-13, at 05:27, Igor Stasenko wrote:
Another (open) question, is how to deal with immutability in presence of become, i.e.:
mutableCopy := immutableObject shallowCopy. immutableObject becomeForward: mutableCopy.
such kind of things making immutability useless for protecting some critical parts of object memory, like preventing modification of compiled method literals: yes you cannot change the immutable object, but you can replace all pointers to it with it's own mutable copy, which is equivalent to making it mutable again.
Why should the VM allow become of an immutable object?
You can disallow this. But you can only make it harder: I can do it manually - take all objects pointing to the immutable one and replace each pointer to it with its mutable copy. And it is completely legal, except that it will be a bit harder, since it is done not by a primitive.
You are confusing something here. Become is just a "fancier form" of assignment that you could in fact write without a primitive by enumerating all the references to an object. If you keep that in mind, it is obvious that, since assigning an immutable *to* a variable is never a problem, using become *with* an immutable as the argument is not a problem either, since in both cases the immutable remains immutable.
But you just make it harder.. what prevents me from replacing all pointers to immutable (a), which holds a pointer to target immutable (t), with a mutable object (b) whose references point to a mutable (t')?
The real question that arises is whether become should be allowed to change the *contents* of an immutable object. Personally, I think it should not, but this has some side effects that require fixing such as class migration which really should have a separate primitive to update its instances after reshape - the rules for this *should* include changing immutables unless you want to change the system to deal with multiple concurrent class versions (much pain down that path).
This is where I usually stop. IMO this already shows too high a cost (to my taste) of introducing such restrictions and maintaining them. I would ask first: is this worth the effort, if it is so costly in terms of implementation? In theory, of course, we can enforce anything, but in practice that means a lot of complex code with many checks on the VM side.. This is not what I like to see in the first place, especially knowing that Squeak has lived well so far without any immutability and it does not feel like we miss it badly.
For me, if we introduce some feature or want to enforce certain behavior, then there should be a very strong reason for doing so, with benefits which clearly outweigh the costs and make some things easier to do - like with the arbitrary properties slot: there's very little additional complexity on the VM side, and it is very simple to use on the language side, replacing many crippling schemes, like properties in Morphic, Dependents in Object etc. In contrast, immutability adds another dimension of complexity to an already complex soup on both the VM and language sides, and the benefits are not so clear to me.
Disallowing #become on immutables raises many additional questions:
What is your action when you need to migrate instances of a class due to its reshaping, while some of them are immutable?
Class migration should really have its own primitive. If it had, much pain could be avoided in migrating classes (see comments in ClassBuilder>>update:to:). And then one could decide on the proper policy to use for immutables.
Cheers, - Andreas
(I bet there will be many other examples where this will break existing traditional schemes, like working with proxies etc.)
I don't want to spread FUD.. I just want to make sure that we are ready to answer every such question.
- Bert -
-- Best regards, Igor Stasenko.
Igor wrote:
On 13 June 2012 15:16, Andreas Raab Andreas.Raab@gmx.de wrote:
On 13 June 2012 09:41, Bert Freudenberg bert@freudenbergs.de wrote:
On 2012-06-13, at 05:27, Igor Stasenko wrote:
Another (open) question, is how to deal with immutability in presence of become, i.e.:
mutableCopy := immutableObject shallowCopy. immutableObject becomeForward: mutableCopy.
Such things make immutability useless for protecting critical parts of object memory, like preventing modification of compiled method literals: yes, you cannot change the immutable object, but you can replace all pointers to it with its own mutable copy, which is equivalent to making it mutable again.
Why should the VM allow become of an immutable object?
You can disallow this. But you can only make it harder: I can do it manually - take all objects pointing to the immutable one and replace each pointer to it with its mutable copy. And it is completely legal, except that it will be a bit harder, since it is done not by a primitive.
You are confusing something here. Become is just a "fancier form" of assignment that you could in fact write without a primitive by enumerating all the references to an object. If you keep that in mind, it is obvious that, since assigning an immutable *to* a variable is never a problem, using become *with* an immutable as the argument is not a problem either, since in both cases the immutable remains immutable.
But you just make it harder.. what prevents me from replacing all pointers to immutable (a), which holds a pointer to target immutable (t), with a mutable object (b) whose references point to a mutable (t')?
Absolutely nothing. Why should it? Assuming that all your "pointers to immutable a" are mutable themselves you can absolutely change them. Go right ahead. You *can* replace an immutable by a mutable object, it's trivial:
value := Object new beImmutable. value := value asMutableCopy.
See, done. So where is the problem? Whether you write it like that or whether you write it like here:
value := Object new beImmutable. value := value become: value asMutableCopy.
is entirely irrelevant. It doesn't change the fact that the original object is still immutable after the operation and that's the only thing that counts for the VM operation. Of course the following would be somewhat different:
immutable := (#abc -> Object new) asImmutable. value := immutable value beImmutable. "making the value immutable" value := value become: value asMutableCopy.
In this case, the following assertions should hold:
self deny: value == immutable value. "assoc was frozen, no changing its value" self assert: value isMutable. self deny: immutable value isImmutable.
So this all works out just fine. BTW, I think that discussing the use of become for hacking into the system is beyond the scope of this discussion. Become is used in very few situations and its semantics are necessarily such that one can do bad things with it. Duly noted, let's move on.
The real question that arises is whether become should be allowed to change the *contents* of an immutable object. Personally, I think it should not, but this has some side effects that require fixing such as class migration which really should have a separate primitive to update its instances after reshape - the rules for this *should* include changing immutables unless you want to change the system to deal with multiple concurrent class versions (much pain down that path).
This is where I usually stop. IMO this already shows too high a cost (to my taste) of introducing such restrictions and maintaining them. I would ask first: is this worth the effort, if it is so costly in terms of implementation? In theory, of course, we can enforce anything, but in practice that means a lot of complex code with many checks on the VM side.. This is not what I like to see in the first place, especially knowing that Squeak has lived well so far without any immutability and it does not feel like we miss it badly.
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find. But it also gives rise to many other interesting techniques (read-only transactions etc).
Cheers, - Andreas
For me, if we introduce some feature or want to enforce certain behavior, then there should be a very strong reason for doing so, with benefits which clearly outweigh the costs and make some things easier to do - like with the arbitrary properties slot: there's very little additional complexity on the VM side, and it is very simple to use on the language side, replacing many crippling schemes, like properties in Morphic, Dependents in Object etc. In contrast, immutability adds another dimension of complexity to an already complex soup on both the VM and language sides, and the benefits are not so clear to me.
Disallowing #become on immutables raises many additional questions:
What is your action when you need to migrate instances of a class due to its reshaping, while some of them are immutable?
Class migration should really have its own primitive. If it had, much pain could be avoided in migrating classes (see comments in ClassBuilder>>update:to:). And then one could decide on the proper policy to use for immutables.
Cheers, - Andreas
(I bet there will be many other examples where this will break existing traditional schemes, like working with proxies etc.)
I don't want to spread FUD.. I just want to make sure that we are ready to answer every such question.
- Bert -
-- Best regards, Igor Stasenko.
-- Best regards, Igor Stasenko.
On 13 June 2012 16:32, Andreas Raab Andreas.Raab@gmx.de wrote:
Igor wrote:
On 13 June 2012 15:16, Andreas Raab Andreas.Raab@gmx.de wrote:
On 13 June 2012 09:41, Bert Freudenberg bert@freudenbergs.de wrote:
On 2012-06-13, at 05:27, Igor Stasenko wrote:
Another (open) question, is how to deal with immutability in presence of become, i.e.:
mutableCopy := immutableObject shallowCopy. immutableObject becomeForward: mutableCopy.
Such things make immutability useless for protecting critical parts of object memory, like preventing modification of compiled method literals: yes, you cannot change the immutable object, but you can replace all pointers to it with its own mutable copy, which is equivalent to making it mutable again.
Why should the VM allow become of an immutable object?
You can disallow this. But you can only make it harder: I can do it manually - take all objects pointing to the immutable one and replace each pointer to it with its mutable copy. And it is completely legal, except that it will be a bit harder, since it is done not by a primitive.
You are confusing something here. Become is just a "fancier form" of assignment that you could in fact write without a primitive by enumerating all the references to an object. If you keep that in mind, it is obvious that, since assigning an immutable *to* a variable is never a problem, using become *with* an immutable as the argument is not a problem either, since in both cases the immutable remains immutable.
But you just make it harder.. what prevents me from replacing all pointers to immutable (a), which holds a pointer to target immutable (t), with a mutable object (b) whose references point to a mutable (t')?
Absolutely nothing. Why should it? Assuming that all your "pointers to immutable a" are mutable themselves you can absolutely change them. Go right ahead. You *can* replace an immutable by a mutable object, it's trivial:
value := Object new beImmutable. value := value asMutableCopy.
See, done. So where is the problem? Whether you write it like that or whether you write it like here:
value := Object new beImmutable. value := value become: value asMutableCopy.
is entirely irrelevant. It doesn't change the fact that the original object is still immutable after the operation and that's the only thing that counts for the VM operation. Of course the following would be somewhat different:
immutable := (#abc -> Object new) asImmutable. value := immutable value beImmutable. "making the value immutable" value := value become: value asMutableCopy.
In this case, the following assertions should hold:
self deny: value == immutable value. "assoc was frozen, no changing its value" self assert: value isMutable. self deny: immutable value isImmutable.
So this all works out just fine. BTW, I think that discussing the use of become for hacking into the system is beyond the scope of this discussion. Become is used in very few situations and its semantics are necessarily such that one can do bad things with it. Duly noted, let's move on.
The real question that arises is whether become should be allowed to change the *contents* of an immutable object. Personally, I think it should not, but this has some side effects that require fixing such as class migration which really should have a separate primitive to update its instances after reshape - the rules for this *should* include changing immutables unless you want to change the system to deal with multiple concurrent class versions (much pain down that path).
This is where I usually stop. IMO this already shows too high a cost (to my taste) of introducing such restrictions and maintaining them. I would ask first: is this worth the effort, if it is so costly in terms of implementation? In theory, of course, we can enforce anything, but in practice that means a lot of complex code with many checks on the VM side.. This is not what I like to see in the first place, especially knowing that Squeak has lived well so far without any immutability and it does not feel like we miss it badly.
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find.
I find this a very weak argument. You are talking about developing a project and fixing bugs.. Yes, I agree immutability can be useful for debugging. But for deployed & well-tested applications? Once you have found all the bugs and deployed your application, do you really think it is worth paying the price of checking all write operations, when you have already made sure that your app will behave correctly?
This is the same scenario as when I build a debug version of the VM which has many assertions and additional checks enabled, so I can find out what is going on. But when you deploy, you disable them for an obvious reason - performance - and because they are not needed once you prove that all preconditions hold, no matter what your code does.
As for *extremely* hard to find: I think the first thing you should address in such cases is the complexity of your application, so that you are always able to reason about its behavior and make sure it behaves "deterministically". Because if software grows so large that you need such crutches to figure out what's happening, immutability alone doesn't solve your problems; it can only indicate that you have problems with your design.
But it also gives rise to many other interesting techniques (read-only transactions etc).
Look, we have a completely open system in our hands. Do you think it is impossible to implement certain things w/o immutability?
Cheers, - Andreas
For me, if we introduce some feature or want to enforce certain behavior, then there should be a very strong reason for doing so, with benefits which clearly outweigh the costs and make some things easier to do - like with the arbitrary properties slot: there's very little additional complexity on the VM side, and it is very simple to use on the language side, replacing many crippling schemes, like properties in Morphic, Dependents in Object etc. In contrast, immutability adds another dimension of complexity to an already complex soup on both the VM and language sides, and the benefits are not so clear to me.
Disallowing #become on immutables raises many additional questions:
What is your action when you need to migrate instances of a class due to its reshaping, while some of them are immutable?
Class migration should really have its own primitive. If it had, much pain could be avoided in migrating classes (see comments in ClassBuilder>>update:to:). And then one could decide on the proper policy to use for immutables.
Cheers, - Andreas
(I bet there will be many other examples where this will break existing traditional schemes, like working with proxies etc.)
I don't want to spread FUD.. I just want to make sure that we are ready to answer every such question.
- Bert -
-- Best regards, Igor Stasenko.
-- Best regards, Igor Stasenko.
Re immutability, I'd like to point out that VisualWorks has had immutability for nearly seven years now, VisualAge has had immutability for even longer, and it's proved straightforward:
- literals are finally immutable, no accidentally changing literals and finding out that the application behaves inexplicably because the source no longer matches the method
- GemStone makes efficient use of the write barrier for its object paging facilities
- this is done with a framework that allows different applications to register managers for different objects so that the VM-level immutability support can be shared amongst different clients (GemStone, debug traps, etc)
So adding immutability isn't adding some radical poorly understood feature, but instead adding something with proven utility that adds safety.
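The shared VM-level support described above amounts to a check on every store. Here is a hedged VMMaker-style sketch; the selector and error names are assumptions for illustration, not VisualWorks' or GemStone's actual API:

```smalltalk
"Sketch of a guarded pointer store. isImmutable: would test the per-object
 header bit; on failure the attempted store is reported to the image, where
 a registered per-object manager (GemStone paging, a debug trap, ...) can
 decide whether to veto, log, or perform the write."
storePointer: fieldIndex ofObject: oop withValue: valueOop
	(self isImmutable: oop) ifTrue:
		[^self signalNoModificationOf: oop field: fieldIndex value: valueOop].
	^self unguardedStorePointer: fieldIndex ofObject: oop withValue: valueOop
```

The design point is that the VM only detects the attempted write; policy lives in the image, which is what lets different clients share one mechanism.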
In the new object format my design philosophy is similar to that of Cog. Be conservative; make sure the existing system is supported well. No experiments; this is a production VM (but that doesn't preclude supporting experiments, e.g. using Stefan's excellent idea of supporting an optional extra header word). Extend for performance; adding 64-bit support, immediate floats and immediate characters all improve performance, as does lazy become and the space and time savings due to the new object representation itself (class index instead of class reference).
None of this prevents folks from doing more radical things with it. But I'm not interested in implementing an experiment, I'm interested in engineering a significantly faster and more scalable VM. Hence for me, forcing a become to add an identityHash is too experimental and risky. I know that the elements I've expressed in the design work; they're all ideas that other systems have used, even if lazy become hasn't been used in Smalltalk VMs before (AFAIA).
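As an illustration of the "class index instead of class reference" saving mentioned above: class access becomes a table lookup on a small header field rather than following a per-object class pointer. The names and field widths below are assumptions based on the layout sketched earlier in the thread, not the final design:

```smalltalk
"Sketch: a 20-bit class index in the header selects an entry in the VM's
 class table, so no object carries a full class pointer and each class
 keeps one stable index (handy for inline caches). classIndexOf: extracts
 the header field; tagOf: extracts an immediate's tag bits."
fetchClassOf: oop
	(self isImmediate: oop) ifTrue:
		[^classTable at: (self tagOf: oop)]. "immediates map their tag to a class"
	^classTable at: (self classIndexOf: oop)
```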
On Wed, Jun 13, 2012 at 9:01 AM, Igor Stasenko siguctua@gmail.com wrote:
On 13 June 2012 16:32, Andreas Raab Andreas.Raab@gmx.de wrote:
Igor wrote:
On 13 June 2012 15:16, Andreas Raab Andreas.Raab@gmx.de wrote:
On 13 June 2012 09:41, Bert Freudenberg bert@freudenbergs.de wrote:
On 2012-06-13, at 05:27, Igor Stasenko wrote:
Another (open) question, is how to deal with immutability in presence of become, i.e.:
mutableCopy := immutableObject shallowCopy. immutableObject becomeForward: mutableCopy.
Such things make immutability useless for protecting critical parts of object memory, like preventing modification of compiled method literals: yes, you cannot change the immutable object, but you can replace all pointers to it with its own mutable copy, which is equivalent to making it mutable again.
Why should the VM allow become of an immutable object?
You can disallow this. But you can only make it harder: I can do it manually - take all objects pointing to the immutable one and replace each pointer to it with its mutable copy. And it is completely legal, except that it will be a bit harder, since it is done not by a primitive.
You are confusing something here. Become is just a "fancier form" of assignment that you could in fact write without a primitive by enumerating all the references to an object. If you keep that in mind, it is obvious that, since assigning an immutable *to* a variable is never a problem, using become *with* an immutable as the argument is not a problem either, since in both cases the immutable remains immutable.
But you just make it harder.. what prevents me from replacing all pointers to immutable (a), which holds a pointer to target immutable (t), with a mutable object (b) whose references point to a mutable (t')?
Absolutely nothing. Why should it? Assuming that all your "pointers to immutable a" are mutable themselves you can absolutely change them. Go right ahead. You *can* replace an immutable by a mutable object, it's trivial:
value := Object new beImmutable. value := value asMutableCopy.
See, done. So where is the problem? Whether you write it like that or whether you write it like here:
value := Object new beImmutable. value := value become: value asMutableCopy.
is entirely irrelevant. It doesn't change the fact that the original object is still immutable after the operation and that's the only thing that counts for the VM operation. Of course the following would be somewhat different:
immutable := (#abc -> Object new) asImmutable. value := immutable value beImmutable. "making the value immutable" value := value become: value asMutableCopy.
In this case, the following assertions should hold:
self deny: value == immutable value. "assoc was frozen, no changing its value" self assert: value isMutable. self deny: immutable value isImmutable.
So this all works out just fine. BTW, I think that discussing the use of become for hacking into the system is beyond the scope of this discussion. Become is used in very few situations and its semantics are necessarily such that one can do bad things with it. Duly noted, let's move on.
The real question that arises is whether become should be allowed to change the *contents* of an immutable object. Personally, I think it should not, but this has some side effects that require fixing such as class migration which really should have a separate primitive to update its instances after reshape - the rules for this *should* include changing immutables unless you want to change the system to deal with multiple concurrent class versions (much pain down that path).
This is where I usually stop. IMO this already shows too high a cost (to my taste) of introducing such restrictions and maintaining them. I would ask first: is this worth the effort, if it is so costly in terms of implementation? In theory, of course, we can enforce anything, but in practice that means a lot of complex code with many checks on the VM side.. This is not what I like to see in the first place, especially knowing that Squeak has lived well so far without any immutability and it does not feel like we miss it badly.
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find.
I find this a very weak argument. You are talking about developing a project and fixing bugs.. Yes, I agree immutability can be useful for debugging. But for deployed & well-tested applications? Once you have found all the bugs and deployed your application, do you really think it is worth paying the price of checking all write operations, when you have already made sure that your app will behave correctly?
This is the same scenario as when I build a debug version of the VM which has many assertions and additional checks enabled, so I can find out what is going on. But when you deploy, you disable them for an obvious reason - performance - and because they are not needed once you prove that all preconditions hold, no matter what your code does.
As for *extremely* hard to find: I think the first thing you should address in such cases is the complexity of your application, so that you are always able to reason about its behavior and make sure it behaves "deterministically". Because if software grows so large that you need such crutches to figure out what's happening, immutability alone doesn't solve your problems; it can only indicate that you have problems with your design.
But it also gives rise to many other interesting techniques (read-only transactions etc).
Look, we have a completely open system in our hands. Do you think it is impossible to implement certain things w/o immutability?
Cheers, - Andreas
For me, if we introduce some feature or want to enforce certain behavior, then there should be a very strong reason for doing so, with benefits which clearly outweigh the costs and make some things easier to do - like with the arbitrary properties slot: there's very little additional complexity on the VM side, and it is very simple to use on the language side, replacing many crippling schemes, like properties in Morphic, Dependents in Object etc. In contrast, immutability adds another dimension of complexity to an already complex soup on both the VM and language sides, and the benefits are not so clear to me.
Disallowing #become on immutables raises many additional questions:
What is your action when you need to migrate instances of a class due to its reshaping, while some of them are immutable?
Class migration should really have its own primitive. If it had, much pain could be avoided in migrating classes (see comments in ClassBuilder>>update:to:). And then one could decide on the proper policy to use for immutables.
Cheers, - Andreas
(I bet there will be many other examples where this will break existing traditional schemes, like working with proxies etc.)
I don't want to spread FUD.. I just want to make sure that we are ready to answer every such question.
- Bert -
-- Best regards, Igor Stasenko.
-- Best regards, Igor Stasenko.
-- Best regards, Igor Stasenko.
On 13 June 2012 20:46, Eliot Miranda eliot.miranda@gmail.com wrote:
None of this prevents folks from doing more radical things with it. But I'm not interested in implementing an experiment, I'm interested in engineering a significantly faster and more scalable VM. Hence for me, forcing a become to add an identityHash is too experimental and risky. I know that the elements I've expressed in the design work; they're all ideas that other systems have used, even if lazy become hasn't been used in Smalltalk VMs before (AFAIA).
I refined my proposal; see my other mail where I propose to always have an extra slot right after the header. (You commented on the first part of it, but did not comment on the second.)
Igor wrote:
On 13 June 2012 16:32, Andreas Raab Andreas.Raab@gmx.de wrote:
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find.
I find this a very weak argument. You are talking about developing a project and fixing bugs.. Yes, I agree immutability can be useful for debugging. But for deployed & well-tested applications? Once you have found all the bugs and deployed your application, do you really think it is worth paying the price of checking all write operations, when you have already made sure that your app will behave correctly?
First of all, the same can be said for array bounds checking. There are very good reasons to leave it on, and the very same reasons hold for immutability. But more importantly, testing is *never* done until "all bugs are found". Testing is done until boredom overcomes fear. And nothing more. You're expressing a rather academic perspective here.
As for *extremely* hard to find: I think the first thing you should address in such cases is the complexity of your application, so that you are always able to reason about its behavior and make sure it behaves "deterministically". Because if software grows so large that you need such crutches to figure out what's happening, immutability alone doesn't solve your problems; it can only indicate that you have problems with your design.
So what you're saying is that we should've just made our application simpler? Gee, I guess we never thought of that! Now I finally understand why Stéphane is paying you the big bucks to work on Pharo :-) Seriously, Igor, academia is not becoming to you, you need to get back into the real world. Preferably into a startup with real pressure to ship something. Because right now you're mostly just talking out of your rear end. In particular considering that one thing that would have made our application considerably simpler would have been immutability.
But it also gives rise to many other interesting techniques (read-only transactions etc).
Look, we have a completely open system in our hands. Do you think it is impossible to implement certain things w/o immutability?
Well, by the end of the day everything is possible. Even implementing immutability without VM support, for example by copying all the classes and recompiling them to raise errors when stored into them. But the problems from that solution are manifold, from class hierarchy issues, to special VM objects etc. Once you go through the exercise (which I did once) you start thinking that it would be so easy and simple to implement immutability in the VM. And it's a fact that in our system having immutability would have made it simpler and faster. But then again, I have no skin in this game any longer, so take my comments with whatever amount of salt you'd like.
Cheers, - Andreas
On 2012-06-14, at 12:53, Andreas Raab wrote:
Igor wrote:
On 13 June 2012 16:32, Andreas Raab Andreas.Raab@gmx.de wrote:
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find.
I find this a very weak argument. You are talking about developing a project and fixing bugs.. Yes, I agree immutability can be useful for debugging. But for deployed & well-tested applications? Once you have found all the bugs and deployed your application, do you really think it is worth paying the price of checking all write operations, when you have already made sure that your app will behave correctly?
First of all, the same can be said for array bounds checking. There are very good reasons to leave it on, and the very same reasons hold for immutability. But more importantly, testing is *never* done until "all bugs are found". Testing is done until boredom overcomes fear. And nothing more. You're expressing a rather academic perspective here.
As for *extremely* hard to find: I think the first thing you should address in such cases is the complexity of your application, so that you are always able to reason about its behavior and make sure it behaves "deterministically". Because if software grows so large that you need such crutches to figure out what's happening, immutability alone doesn't solve your problems; it can only indicate that you have problems with your design.
So what you're saying is that we should've just made our application simpler? Gee, I guess we never thought of that! Now I finally understand why Stéphane is paying you the big bucks to work on Pharo :-) Seriously, Igor, academia is not becoming to you, you need to get back into the real world. Preferably into a startup with real pressure to ship something. Because right now you're mostly just talking out of your rear end. In particular considering that one thing that would have made our application considerably simpler would have been immutability.
But it also gives rise to many other interesting techniques (read-only transactions etc).
Look, we have a completely open system in our hands. Do you think it is impossible to implement certain things without immutability?
Well, at the end of the day everything is possible, even implementing immutability without VM support, for example by copying all the classes and recompiling them to raise errors on stores. But the problems with that solution are manifold, from class hierarchy issues to special VM objects etc. Once you go through the exercise (which I did once) you realize that it would be so easy and simple to implement immutability in the VM. And it's a fact that in our system having immutability would have made it simpler and faster. But then again, I have no skin in this game any longer, so take my comments with whatever amount of salt you'd like.
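The "easy and simple in the VM" point is that the whole check comes down to one bit test in the store primitive. A minimal C sketch of that idea, with a hypothetical header layout (the bit position and every name here are illustrative, not the actual Cog format):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative object header with an immutability bit, loosely in the
 * spirit of the header sketch earlier in the thread. Hypothetical. */
#define IMMUTABLE_BIT (1u << 23)

typedef struct {
    uint32_t header;    /* format bits, hash, class index, ... */
    intptr_t slots[4];  /* instance variables */
} Obj;

/* The write barrier the VM would run on every inst-var store: a single
 * bit test before the store. Returns 0 on success, -1 if the object is
 * immutable (where a real VM would signal a language-level error). */
int store_slot(Obj *o, int index, intptr_t value) {
    if (o->header & IMMUTABLE_BIT)
        return -1;
    o->slots[index] = value;
    return 0;
}

void make_immutable(Obj *o) { o->header |= IMMUTABLE_BIT; }
```

The cheapness of this single, well-predicted test is consistent with the small overheads claimed later in the thread for VisualWorks.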
Cheers,
- Andreas
It should be possible to disagree and still keep the discussion civilized. Please?
I trust Eliot will take all this with a lump of his preferred mineral and come up with something good ;)
- Bert -
It should be possible to disagree and still keep the discussion civilized. Please?
My apologies. It wasn't meant to be uncivilized - just a bit of a gut reaction to "you know, you should just make your application a little simpler, then your need for immutability would go away" (*rolling my eyes*) I know comments like that from some of our so-called Engineering VPs in my last life and I might react a bit allergic to them. Apologies again.
Cheers, - Andreas
I trust Eliot will take all this with a lump of his preferred mineral and come up with something good ;)
- Bert -
On 14 June 2012 14:58, Andreas Raab Andreas.Raab@gmx.de wrote:
It should be possible to disagree and still keep the discussion civilized. Please?
My apologies. It wasn't meant to be uncivilized - just a bit of a gut reaction to "you know, you should just make your application a little simpler, then your need for immutability would go away" (*rolling my eyes*) I know comments like that from some of our so-called Engineering VPs in my last life and I might react a bit allergic to them. Apologies again.
I was trying to discuss better solutions which may not require immutability. I did not want to teach you about programming or whatever, but to point out that there is no silver bullet: problems in your design won't magically disappear once you have immutability.
But your reaction can be expressed as: (rolling eyes) what are you talking about?!?! there's nothing better, period (end rolling eyes). So I stand off; I cannot continue discussing with such a stubborn attitude.
Cheers, - Andreas
I trust Eliot will take all this with a lump of his preferred mineral and come up with something good ;)
- Bert -
-- Empfehlen Sie GMX DSL Ihren Freunden und Bekannten und wir belohnen Sie mit bis zu 50,- Euro! https://freundschaftswerbung.gmx.de
(final comment)
On 14 June 2012 14:58, Andreas Raab Andreas.Raab@gmx.de wrote:
It should be possible to disagree and still keep the discussion civilized. Please?
My apologies. It wasn't meant to be uncivilized - just a bit of a gut reaction to "you know, you should just make your application a little simpler, then your need for immutability would go away" (*rolling my eyes*) I know comments like that from some of our so-called Engineering VPs in my last life and I might react a bit allergic to them. Apologies again.
I was trying to discuss a better solutions which may not require immutability. I did not wanted to teach you about programming whatever, but to point out that there is no silver bullet: a problems in your design won't magically disappear once you will have immutability.
But your reaction can be expressed as: (rolling eyes) what are you talking about?!?!
What my rolling eyes mean is that the argument is a non sequitur. The need for immutability is independent of the premise about how complex the system is, therefore the argument is a fallacy. Anyway, I'm done here. Over and out.
Cheers, - Andreas
there's nothing better. period (end rolling eyes). so i stand off.. i cannot continue discussing with such stubborn attitude.
Cheers, - Andreas
I trust Eliot will take all this with a lump of his preferred mineral and come up with something good ;)
- Bert -
-- Best regards, Igor Stasenko.
On 14 June 2012 18:59, Andreas Raab Andreas.Raab@gmx.de wrote:
(final comment)
On 14 June 2012 14:58, Andreas Raab Andreas.Raab@gmx.de wrote:
It should be possible to disagree and still keep the discussion civilized. Please?
My apologies. It wasn't meant to be uncivilized - just a bit of a gut reaction to "you know, you should just make your application a little simpler, then your need for immutability would go away" (*rolling my eyes*) I know comments like that from some of our so-called Engineering VPs in my last life and I might react a bit allergic to them. Apologies again.
I was trying to discuss a better solutions which may not require immutability. I did not wanted to teach you about programming whatever, but to point out that there is no silver bullet: a problems in your design won't magically disappear once you will have immutability.
But your reaction can be expressed as: (rolling eyes) what are you talking about?!?!
What my rolling eyes mean is that the argument is non sequitor.
I don't know what is sequitor, and i cannot find it in vocabulary/translator so i cannot understand.
The need for immutability is independent of the premise about how complex the system is, therefore the argument is a fallacy. Anyway, I'm done here. Over and out.
Agreed, but this is exactly what I tried to show you when you stated: --------- The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find. --------- If something is extremely hard to find, I understand this as a complex system which is hard to manage and reason about (otherwise why would it be extremely hard?)
So, as I understood it, you are advocating the need for immutability by demonstrating how it can help to find design flaws in *complex* systems. And I agree that it helps, but I found this argument very weak, because immutability offers almost nothing in terms of having less complex systems. Especially when "The need for immutability is independent of the premise about how complex the system is".
so why we need it, again? :)
Cheers, - Andreas
there's nothing better. period (end rolling eyes). so i stand off.. i cannot continue discussing with such stubborn attitude.
Cheers, - Andreas
I trust Eliot will take all this with a lump of his preferred mineral and come up with something good ;)
- Bert -
What I really don't understand is why my opponents readily want to sacrifice performance in order to deal with the consequences of having complex systems which are hard to reason about, while at the same time completely opposing proposals for features which would help reduce complexity in the first place, like adding a slot for arbitrary properties.
Sounds like you prefer to deal with the consequences rather than the cause..
On Thu, Jun 14, 2012 at 11:17 AM, Igor Stasenko siguctua@gmail.com wrote:
What i really don't understand is why my opponents readily want to sacrifice the performance in order to deal with consequences of having complex systems, when its hard to reason about it, and at same time completely opposed to proposal of adding features which will help to reduce complexity in a first place, like adding slot for having arbitrary properties.
You're putting up a straw man. You *think* the performance cost of immutability is an issue, but my experience tells me it isn't. I've implemented it before. So please stop raising an invalid objection.
Sounds, like you prefer to deal with consequences rather than with cause..
Bah, humbug.
-- Best regards, Igor Stasenko.
On 14 June 2012 23:47, Eliot Miranda eliot.miranda@gmail.com wrote:
On Thu, Jun 14, 2012 at 11:17 AM, Igor Stasenko siguctua@gmail.com wrote:
What i really don't understand is why my opponents readily want to sacrifice the performance in order to deal with consequences of having complex systems, when its hard to reason about it, and at same time completely opposed to proposal of adding features which will help to reduce complexity in a first place, like adding slot for having arbitrary properties.
You're putting up a straw man. You *think* performance of immutability is an issue, but my experience tells me it isn't. I've implemented it before. So please stop raising an invalid objection.
Remember what I was told when I implemented language-side scheduling, removing the need for the VM to even know what a Semaphore is? I was told *it is slow*. And this was the *only* argument against it, why it was found unacceptable. So why is my objection invalid while yours are valid? I am playing the same game here, using the same argument.
Of course I can just shut up.. but then tell me to shut up; don't say that I am raising invalid objections.
Hello
2012/6/15 Igor Stasenko siguctua@gmail.com
On 14 June 2012 23:47, Eliot Miranda eliot.miranda@gmail.com wrote:
On Thu, Jun 14, 2012 at 11:17 AM, Igor Stasenko siguctua@gmail.com wrote:
What i really don't understand is why my opponents readily want to sacrifice the performance in order to deal with consequences of having complex systems, when its hard to reason about it, and at same time completely opposed to proposal of adding features which will help to reduce complexity in a first place, like adding slot for having arbitrary properties.
You're putting up a straw man. You *think* performance of immutability is an issue, but my experience tells me it isn't. I've implemented it before. So please stop raising an invalid objection.
Remember, what i have been told when i implemented a language-side scheduling, removing the need of VM to even know that is Semaphore? I been told *it is slow*. And this was the *only* argument against it, why it is found unacceptable.
Did you try your scheduling implementation on Cog? It would be very interesting to see the difference in performance.
On 15 June 2012 09:11, Denis Kudriashov dionisiydk@gmail.com wrote:
Hello
2012/6/15 Igor Stasenko siguctua@gmail.com
On 14 June 2012 23:47, Eliot Miranda eliot.miranda@gmail.com wrote:
On Thu, Jun 14, 2012 at 11:17 AM, Igor Stasenko siguctua@gmail.com wrote:
What i really don't understand is why my opponents readily want to sacrifice the performance in order to deal with consequences of having complex systems, when its hard to reason about it, and at same time completely opposed to proposal of adding features which will help to reduce complexity in a first place, like adding slot for having arbitrary properties.
You're putting up a straw man. You *think* performance of immutability is an issue, but my experience tells me it isn't. I've implemented it before. So please stop raising an invalid objection.
Remember, what i have been told when i implemented a language-side scheduling, removing the need of VM to even know that is Semaphore? I been told *it is slow*. And this was the *only* argument against it, why it is found unacceptable.
Did you try your scheduling implementation on Cog? It would be very interesting to see the difference in performance.
I'm not sure, but if I remember correctly, Cog bakes in even more assumptions (read 'contracts') about Processes and scheduling. It could still be possible, but it is hard to find the motivation to do it when others don't see its real value and instead throw it out based on an "unproven" argument.
It is the same thing again: I am trying to swim against the flow and reduce the responsibilities of the VM, while most people are doing the complete opposite, where adding ever more contracts between the VM and the language is considered a good thing (tm).
The more early-bound semantics you put into the language, the more static it becomes, and if we don't stop going that way, one day we will find ourselves programming in C, just with an "exotic" Smalltalk syntax. So why not just give up and switch to C? I know it is sometimes very hard to provide some functionality without reaching for the early-binding hammer, and the reasons can be, of course, performance and complexity. Let me remind you that "Hardware is really just software crystallized early." And what is a virtual machine to a language, if not hardware (the word Machine speaks for itself)? That's why we should think twice before crystallizing something, because once it is carved in stone it is much harder to change and evolve.
That's why I am raising questions about immutability in particular. My view of the VM's role is that of a servant which can optimize certain things, not a dictator standing in your way, telling you what is possible and what is not.
Remember, what i have been told when i implemented a language-side scheduling, removing the need of VM to even know that is Semaphore? I been told *it is slow*. And this was the *only* argument against it, why it is found unacceptable.
I think you might be misremembering. IIRC, the real argument was the risk (albeit expressed as performance concerns) of replacing VM-level scheduling by image-level scheduling without further ado. I don't recall that anyone had an objection to image-level scheduling as an option in addition to VM-level scheduling. In which case one can experiment with the implications and learn from the change in the environment without necessarily committing the production systems to an unproven feature.
Cheers, - Andreas
On 15 June 2012 10:48, Andreas Raab Andreas.Raab@gmx.de wrote:
Remember, what i have been told when i implemented a language-side scheduling, removing the need of VM to even know that is Semaphore? I been told *it is slow*. And this was the *only* argument against it, why it is found unacceptable.
I think you might be misremembering. IIRC, the real argument was the risk (albeit expressed as performance concerns) of replacing VM-level scheduling by image-level scheduling without further ado. I don't recall that anyone had an objection to image-level scheduling as an option in addition to VM-level scheduling. In which case one can experiment with the implications and learn from the change in the environment without necessarily committing the production systems to an unproven feature.
But that's what I did: I made a VM which remains compatible with the existing scheduling policy but adds a way to have language-side scheduling. You simply switch the scheduler object and voila, you have image-side scheduling. So if you are not sure about the unproven part, you can keep running the "proven" one..
I am sure you know that you cannot make an omelet without breaking eggs. Needless to say (and I'm not going to repeat it here) what benefits language-side scheduling provides compared to hardcoded semantics you are forced to rely on.
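The "simply switch the scheduler object" design can be sketched as a pluggable policy hook: the VM dispatches through a pointer whose default is the built-in ("proven") policy, and the image installs its own when it wants language-side scheduling. All names here are illustrative, not the actual implementation:

```c
#include <assert.h>

/* Hypothetical runnable processes and a pluggable scheduling policy. */
typedef struct { int priority; int id; } Proc;

typedef int (*PickNext)(const Proc *runnable, int n);  /* returns an index */

/* Built-in highest-priority-first policy: the "proven" default. */
static int pick_highest_priority(const Proc *r, int n) {
    int best = 0;
    for (int i = 1; i < n; i++)
        if (r[i].priority > r[best].priority)
            best = i;
    return best;
}

/* A naive round-robin policy an image-side scheduler might install. */
static int rr_cursor = 0;
static int pick_round_robin(const Proc *r, int n) {
    (void)r;
    return rr_cursor++ % n;
}

static PickNext vm_policy = pick_highest_priority;

/* The VM always schedules through the current policy... */
int vm_schedule(const Proc *runnable, int n) { return vm_policy(runnable, n); }

/* ...and "switching the scheduler object" is just replacing it. */
void vm_install_policy(PickNext p) { vm_policy = p; }
```

The point of the design is exactly what Igor argues: the contract the VM hardcodes shrinks to "call the policy", and the policy itself becomes late-bound.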
Remember, what i have been told when i implemented a language-side scheduling, removing the need of VM to even know that is Semaphore? I been told *it is slow*. And this was the *only* argument against it, why it is found unacceptable.
I think you might be misremembering. IIRC, the real argument was the risk (albeit expressed as performance concerns) of replacing VM-level scheduling by image-level scheduling without further ado. I don't recall that anyone had an objection to image-level scheduling as an option in addition to VM-level scheduling. In which case one can experiment with the implications and learn from the change in the environment without necessarily committing the production systems to an unproven feature.
But that's what i did, i made VM which remain compatible to existing scheduling policy, but adds a way to have a language-side scheduling, you simply switching the scheduler object, and voila, you got an image-side scheduling. So, if you not sure about unproven part, you can keep running using a "proven" one..
Right, I misremembered the implementation. But when I re-read the discussion just now, there was not a single voice raised against your proposal. Not one. There was only applause, encouragement and excitement. People did comment on performance, but in no way rejected the implementation for it. Here is what I searched for:
http://www.google.com/#q=%22new+scheduler%22+site:lists.squeakfoundation.org
Is it perhaps possible that you simply forgot to push it all the way through to VMMaker? My search does not find any results on vm-dev, which to me indicates that the code was probably never "on the table" for inclusion.
Cheers, - Andreas
On 15 June 2012 17:26, Andreas Raab Andreas.Raab@gmx.de wrote:
Remember, what i have been told when i implemented a language-side scheduling, removing the need of VM to even know that is Semaphore? I been told *it is slow*. And this was the *only* argument against it, why it is found unacceptable.
I think you might be misremembering. IIRC, the real argument was the risk (albeit expressed as performance concerns) of replacing VM-level scheduling by image-level scheduling without further ado. I don't recall that anyone had an objection to image-level scheduling as an option in addition to VM-level scheduling. In which case one can experiment with the implications and learn from the change in the environment without necessarily committing the production systems to an unproven feature.
But that's what i did, i made VM which remain compatible to existing scheduling policy, but adds a way to have a language-side scheduling, you simply switching the scheduler object, and voila, you got an image-side scheduling. So, if you not sure about unproven part, you can keep running using a "proven" one..
Right. I misremembered the implementation. But when I was re-reading the discussion just now, there was not a single voice being raised against your proposal. Not one. There was only applause and encouragement and excitement. People did comment on performance but in no way rejecting the implementation for it. Here is what searched for:
http://www.google.com/#q=%22new+scheduler%22+site:lists.squeakfoundation.org
Is it perhaps possible that you simply forgot to push it all the way through to VMMaker? My search does not find any results on vm-dev, which to me indicates that the code was probably never "on the table" for inclusion.
Perhaps I misinterpreted the feedback on my work (I will take time to re-read the mail exchange). But I stopped mainly because it was done and ready for integration. The VM was working and everything was there.
Why did I not push it to VMMaker? Because I expected a _political_ decision on whether we do it or not, and that decision was never made. And for sure, I never assumed I was in a position to force certain changes into the VM without asking the community. And since my own choice was obvious (otherwise why would I have spent time implementing all of it?), the final decision apparently could not be mine.
So, IIRC, I didn't see any response like "Yes, we should integrate that"; what I saw was "yeah, it _could_ be an option". I am not very good at English, but I can certainly tell the difference between _could_ and _must_.
And of course, at any moment I can revive this work and synchronize it with the Cog VM, if there is a strong opinion from the majority that this is something we want to have, and not just "well, maybe, perhaps, one day.. when you walk over my dead body" :)
Cheers, - Andreas
On 6/14/12, Igor Stasenko siguctua@gmail.com wrote:
On 14 June 2012 18:59, Andreas Raab Andreas.Raab@gmx.de wrote:
(final comment)
On 14 June 2012 14:58, Andreas Raab Andreas.Raab@gmx.de wrote:
It should be possible to disagree and still keep the discussion civilized. Please?
My apologies. It wasn't meant to be uncivilized - just a bit of a gut reaction to "you know, you should just make your application a little simpler, then your need for immutability would go away" (*rolling my eyes*) I know comments like that from some of our so-called Engineering VPs in my last life and I might react a bit allergic to them. Apologies again.
I was trying to discuss a better solutions which may not require immutability. I did not wanted to teach you about programming whatever, but to point out that there is no silver bullet: a problems in your design won't magically disappear once you will have immutability.
But your reaction can be expressed as: (rolling eyes) what are you talking about?!?!
What my rolling eyes mean is that the argument is non sequitor.
I don't know what is sequitor, and i cannot find it in vocabulary/translator so i cannot understand.
Maybe you can use Google. For me it is the third entry :-)
http://en.wikipedia.org/wiki/Non_sequitur_%28logic%29
HJH
The need for immutability is independent of the premise about how complex the system is, therefore the argument is a fallacy. Anyway, I'm done here. Over and out.
agree, but this is exactly what i tried to show you when you stated:
The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find.
If something extremely hard to find -> i understand this as a complex system which hard to manage and reason about (otherwise why it would be extremely hard?)
So, as i understood, you advocating the need for immutability by demonstrating how it can help to find flaws in design in *complex* systems.. And i agree that it helps, but found this advocacy argument very weak, because immutability offers almost nothing in terms of having less complex systems. And especially, when "The need for immutability is independent of the premise about how complex the system is"
so why we need it, again? :)
Cheers,
- Andreas
there's nothing better. period (end rolling eyes). so i stand off.. i cannot continue discussing with such stubborn attitude.
Cheers,
- Andreas
I trust Eliot will take all this with a lump of his preferred mineral and come up with something good ;)
- Bert -
-- Best regards, Igor Stasenko.
On 14 June 2012 12:53, Andreas Raab Andreas.Raab@gmx.de wrote:
Igor wrote:
On 13 June 2012 16:32, Andreas Raab Andreas.Raab@gmx.de wrote:
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find.
I finding this very weak argument. You are talking about developing project and bug fixing.. Yes i agree immutability can be useful for debugging. But for deployed & well tested applications? Once you found all bugs, and deploy your application, do you really think it is worth paying the price for checking all write operations, when you already made sure that your app will behave correctly?
First of all, the same can be said for array bounds checking. There are very good reasons to leave it on, and the very same reasons hold for immutability. But more importantly, testing is *never* done until "all bugs are found". Testing is done until boredom overcomes fear. And nothing more. You're expressing a rather academic perspective here.
Sure, because we're talking about object format design. I am just not sure that the benefits of immutability are so great that they outweigh the performance overhead. And I think we should not stop looking for better solutions.
As for *extremely* hard to find, i think first thing which you should address in such cases is the complexity of your application, to always be able to reason about it's behavior and make sure it behaves "deterministically", because if software grows too large up to the point that you need such kind of crutches to figure out what's happening, immutability alone doesn't solve your problems, it can only indicate that you have a problems with your design.
So what you're saying is that we should've just made our application simpler? Gee, I guess we never thought of that! Now I finally understand why Stefane is paying you the big bucks to work on Pharo :-)
How nice of you, Andreas. Did you sleep badly? I just said that using immutability to _detect_ problems in your software is not the same as using immutability to address certain problems.
Seriously, Igor, academia is not becoming to you, you need to get back into the real world. Preferredly into a startup with real pressure to ship something.
Yeah, a good example of such work would be JavaScript, which was made in two weeks. Nice suggestion, but thanks, no.
Trust me, I worked long enough in industry and know what you are talking about. But I don't see any parallels to what we are discussing here; it is completely orthogonal. There is an enormous amount of crap in software produced because of such an approach, and Squeak and its VM are no exception. If you think that immutability can help you produce less crappy software, then I am afraid I have to disappoint you: no, the amount of crap will remain the same, and even increase. Because the more concepts you put into a domain, the more complex it becomes, and in a non-linear progression, for sure.
If we want to improve that, then it is better to take time and think twice before cutting (and yeah, read some of those hated academic papers, if you would like to know whether other or better solutions to your problems exist, because (what a surprise!!) many people run into the same problems as you do).
Because, right now you're mostly just talking out of your rear end. In particular considering that one thing that would have made our application considerably simpler would have been immutability.
Care to share a practical example? Because to me your attitude to this flag sounds like a "solve all my problems = 1" bit in the object header.
But it also gives rise to many other interesting techniques (read-only transactions etc).
Look, we have completely open system in our hands. Do you think it is impossible to implement certain things w/o immutability?
Well, by the end of the day everything is possible. Even implementing immutability without VM support, for example by copying all the classes and recompiling them to raise errors when stored into them. But the problems from that solution are manifold, from class hierarchy issues, to special VM objects etc. Once you go through the exercise (which I did once) you start thinking that it would be so easy and simple to implement immutability in the VM. And it's a fact that in our system having immutability would have made it simpler and faster.
That's what I think too. But why stop there and not think further: what if there is a solution which can make things even simpler and doesn't require immutability? That is where I am trying to drive the discussion. Now, if you keep insisting that immutability solves all your problems, and you don't care to think about better alternatives (because you have a boss with a whip who will execute you if you don't deliver on time), then there is nothing to discuss.
But then again, I have no skin in this game any longer, so take my comments with whatever amount of salt you'd like.
Cheers, - Andreas
On 14 Jun 2012, at 18:03, Igor Stasenko wrote:
Sure. Because we're talking about object format design. I just not sure that benefits of immutability so great that they outweigh the performance overhead. And i thinking we should not stop looking for better solutions.
If performance is your main concern, then do not implement it with a header bit, but use what your hardware provides you with: memory protection.
That will make your GC more complex, but if you move all protected objects onto the same page, you can make it read and/or write protected. In case it is only write protected, reads will be as fast as for every other object.
If you write to it, you'll have to catch a signal, and can propagate it to the language level. To write such an object, just map the same page again at another address, with a nice offset and redirect the writes there.
I don't think such a solution will make the VM maintainers' lives a lot easier, but at least you won't have to suffer your performance hit. And you will not actually need a header bit either...
More inspiration here:
Ribbons: a Partially Shared Memory Programming Model http://www.cs.purdue.edu/homes/peugster/Ribbons/RJ.pdf
Best regards Stefan
On Thu, Jun 14, 2012 at 9:03 AM, Igor Stasenko siguctua@gmail.com wrote:
On 14 June 2012 12:53, Andreas Raab Andreas.Raab@gmx.de wrote:
Igor wrote:
On 13 June 2012 16:32, Andreas Raab Andreas.Raab@gmx.de wrote:
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find.
I finding this very weak argument. You are talking about developing project and bug fixing.. Yes i agree immutability can be useful for debugging. But for deployed & well tested applications? Once you found all bugs, and deploy your application, do you really think it is worth paying the price for checking all write operations, when you already made sure that your app will behave correctly?
First of all, the same can be said for array bounds checking. There are very good reasons to leave it on, and the very same reasons hold for immutability. But more importantly, testing is *never* done until "all bugs are found". Testing is done until boredom overcomes fear. And nothing more. You're expressing a rather academic perspective here.
Sure. Because we're talking about object format design. I just not sure that benefits of immutability so great that they outweigh the performance overhead.
But do you know what the performance overhead is? In VisualWorks it was < 5% over a wide range of benchmarks (e.g. real benchmarks such as recompiling the system).
And i thinking we should not stop looking for better solutions.
Of course. But for now immutability is in because I, and many others with real experience in the Smalltalk world, think it more than pays its way.
As for *extremely* hard to find, i think first thing which you should address in such cases is the complexity of your application, to always be able to reason about it's behavior and make sure it behaves "deterministically", because if software grows too large up to the point that you need such kind of crutches to figure out what's happening, immutability alone doesn't solve your problems, it can only indicate that you have a problems with your design.
So what you're saying is that we should've just made our application simpler? Gee, I guess we never thought of that! Now I finally understand why Stefane is paying you the big bucks to work on Pharo :-)
How nice of you, Andreas. You had bad sleep? I just said that using immutability to _detect_ problems in your software is not the same as using immutablity to address certain problems.
Seriously, Igor, academia is not becoming to you, you need to get back into the real world. Preferredly into a startup with real pressure to ship something.
Yeah, a good example of such work will be JavaScript, which made in two weeks. Nice suggestion, but thanks. No.
Trust me, I worked long enough in industry to know what you are talking about. But I don't see any parallels to what we are discussing here; it is completely orthogonal. There is an enormous amount of crap produced in software because of that approach, and Squeak and its VM are not an exception. If you think immutability can help you produce less crappy software, then I'm afraid I have to disappoint you: no, the amount of crap will remain the same, or even increase. Because the more concepts you put into the domain, the more complex it becomes, and in a non-linear progression, for sure.
Not so. Just avoiding updates to supposedly immutable literals is a win. When immutability was added to VisualWorks, many errors were found in the base classes, the most typical being related to `^'' writeStream`.
If we want to improve that, then it is better to take the time and think twice before cutting (and yeah, read some of those hated academic papers, if you'd like to know whether other, better solutions to your problems exist, because, what a surprise, many people have met the same problems as you).
Because, right now you're mostly just talking out of your rear end. In particular considering that one thing that would have made our application considerably simpler would have been immutability.
Care to share a practical example? Because to me your attitude to this flag sounds like a "solve-all-my-problems = 1" bit in the object header.
But it also gives rise to many other interesting techniques (read-only transactions etc).
Look, we have a completely open system in our hands. Do you think it is impossible to implement certain things without immutability?
Well, at the end of the day everything is possible. Even implementing immutability without VM support, for example by copying all the classes and recompiling them to raise errors on stores. But the problems with that solution are manifold, from class hierarchy issues to special VM objects etc. Once you go through the exercise (which I did once) you start thinking that it would be so much easier and simpler to implement immutability in the VM. And it's a fact that in our system having immutability would have made things simpler and faster.
That's what I think too. But why stop there and not think further: what if there is a solution which makes things even simpler and doesn't require immutability? That is where I am trying to drive the discussion. Now, if you keep insisting that immutability solves all your problems, and you don't care to think about better alternatives (because you have a boss with a whip who will execute you if you don't deliver on time), then there's nothing to discuss.
But then again, I have no skin in this game any longer, so take my comments with whatever amount of salt you'd like.
Cheers,
- Andreas
-- Best regards, Igor Stasenko.
On 14 June 2012 23:44, Eliot Miranda eliot.miranda@gmail.com wrote:
On Thu, Jun 14, 2012 at 9:03 AM, Igor Stasenko siguctua@gmail.com wrote:
On 14 June 2012 12:53, Andreas Raab Andreas.Raab@gmx.de wrote:
Igor wrote:
On 13 June 2012 16:32, Andreas Raab Andreas.Raab@gmx.de wrote:
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find.
[snip]
But do you know what the performance overhead is? In VisualWorks it was < 5% over a wide range of benchmarks (e.g. real benchmarks such as recompiling the system).
Yes, I remember that number.
But let's pose the *right* question: guys, we have an extra 5% of performance to spend; which feature would you like to spend it on?
Are you sure that immutability would be the absolute winner?
[snip]
Not so. Just avoiding updating supposedly immutable literals is a win. When immutability was added to VisualWorks many errors in the base classes were found. The most typical being related to ^'' writeStream.
Yes, this is what Andreas says. And still, I don't buy it (not because it's from Andreas, for sure ;).
Let me repeat the same silly, bullying, idiotic question again: suppose you found all those errors and fixed them. What next? Your system runs well, does everything correctly, and... you still keep paying the price of an extra check per write?
I see two major areas where immutability is useful:
A) to detect & fix bugs, as you and Andreas described
B) to implement certain features
A. Yes, it's useful, no doubt. But one can certainly detect these kinds of errors (as well as many others) by running the software in a "supervision" mode (like VMMaker simulating the VM). Writing proper tests helps in that regard too. The only difference in making immutability an inherent property of the system is that you keep testing, testing, testing every time, even software that is already tested and proven to work. Is it only me who thinks that sounds like a waste? You don't run the tests in the deployed application, do you? You test and fix during development, then you ship it, and it works, with no overhead to pay. If immutability were the *only* possible solution to these problems, I would vote for it with both hands. You may answer: yes, of course we could do it another way, but we are too lazy to test the software properly in the first place, so let it be like that. And that would be the right answer :)
B. Most software doesn't need immutability to implement its features. Fact. Working Squeak-family systems back this up just fine. So when and where do you need it? In some specialized applications like OODBs. And it is not merely because it isn't available *yet*; if it were that essential, immutability would have been there much, much earlier. Marking literals as read-only is not a *feature*; it is just protection from fools.
So, like it or not, I don't consider the impact of immutability so significant that it is worth paying 5% of performance.
And besides, why a flag? If we can use the hardware to protect against memory writes (as Stefan mentioned), there will be no overhead, and then I won't say a single word against it. If you remember, I proposed before to develop a segmented object memory model, where you can split memory into segments serving many different purposes (again, not solely to support immutability): you could have read-only segments, segments for pinned objects, etc., and apply different strategies for working with the objects residing in each. And I bet some work has already been done in that direction by others, so it is hardly an *experimental*, *exotic* or *academic* area (the *-* words are sorted in order of scariness ;)).
That is a feature (or set of features) which I would like to have and am ready to pay the price for. But immutability alone? Nope.
btw, about memory segments: look, we have them already:
- old space
- new space
- JIT space
- stack
So all we need is to crystallize this into a generic concept, and then we can add:
- read-only space
- pinned space
- whatever space
To me this is the logical step everyone should take once they see a common pattern: generalize it. And contrary to Stefan's fears, I think this will lead us to better levels of quality in VM design.
On Fri, Jun 15, 2012 at 01:15:15AM +0200, Igor Stasenko wrote:
On 14 June 2012 23:44, Eliot Miranda eliot.miranda@gmail.com wrote:
But do you know what the performance overhead is? In VisualWorks it was < 5% over a wide range of benchmarks (e.g. real benchmarks such as recompiling the system).
Yes, I remember that number.
But let's pose the *right* question: we have an extra 5% of performance to spend; which feature would you like to spend it on?
It seems to me that some of the topics being discussed here might benefit from empirical measurements. For example, a particular object header format might have an effect on the speed of walking through the object memory heap. In such a case, it would be relatively easy to take measurements to estimate the actual performance that might be expected without actually doing a complete implementation. The process might look something like this:
- Trace an image to the desired new format. Do not worry about making it functional, just make it good enough that objects in the traced image can be reached with #objectAfter:
- Hack the interpreter code just enough to make #objectAfter: work with the new object format. Do not worry about making a functional VM.
- Generate a new "VM" that gets just far enough to load the image into an object memory. Turn off inlining when generating the "VM" so that a profiler can see where the time is being spent. Arrange for the "VM" to go immediately into a loop that walks through the object memory from firstAccessibleObject to end of memory (see StackInterpreter>>allAccessibleObjectsOkay for example). Loop for a few thousand iterations through the heap, then exit.
- Profile it (gprof). See where the time goes and how long it takes to run to completion.
- Compare the results obtained with various object header strategies.
Experiments like this might produce some very useful insights. It would be a lot of work for one person to do all the research on this, but it seems to me that we now have quite a few interested and competent VM hackers who might be willing to pitch in to obtain empirical data.
Any interest?
Dave
Hi Dave
I asked Camillo to do that a week ago on a similar topic. So we will see.
Stef
On Jun 15, 2012, at 2:49 AM, David T. Lewis wrote:
[snip]
Igor wrote:
In theory, of course, we can enforce anything, but in practice that means a lot of complex code with many checks on the VM side. This is not what I'd like to see in the first place, especially knowing that Squeak has lived well so far without any immutability, and it does not feel like we miss it badly.
Andreas responded:
I absolutely do. There were several situations (for example in Croquet and at Teleplace) where we changed our designs to the worse merely due to the lack of immutability support. The main thing that immutability fixes is to prevent accidental modifications of objects thought to be immutable (method literals for example), which when they happen are *extremely* hard to find. But it also gives rise to many other interesting techniques (read-only transactions etc).
I have to agree with Andreas here. I've used immutability to good effect in VisualWorks, and run into several situations where I wanted it in Squeak and had to make do with a work around. Avi's WriteBarrier package is one of those workarounds.
To be honest, I'm a little puzzled by the resistance to immutability. It's not like this is a new idea with unclear semantics or unknown utility. The VisualWorks implementation is simple and useful. VisualWorks is still the fastest Smalltalk around, so the performance impact can't be *too* high. VisualWorks is fairly similar to Squeak, so it's not unreasonable to use it as a model for considering immutability in Squeak.
Now, maybe there really are problems with the VW, VA or Gemstone implementations, and if so, let's hear about them. "It's not necessary" isn't a good argument, because nothing is really necessary—assembly works fine. What are the costs, benefits and trade-offs of VM-supported immutability?
Colin
On 14 June 2012 20:31, Colin Putney colin@wiresong.com wrote:
[snip]
To be honest, I'm a little puzzled by the resistance to immutability.
Well, my main argument against it is that Smalltalk has lived quite well without it so far. If it were so crucial, it would have been there long ago (even back in ST-80), wouldn't it?
From a purist's perspective, it also doesn't look like a good move. Because originally we started from something quite simple and universal:
- everything is an object
then we added:
- all objects are instances of some class
but now we are adding:
- all objects *also* have an identity hash
- all objects *also* have an immutability bit
Today we have a few of those *also*s; tomorrow we will find the need for 50 more, because there's virtually no limit to the additional properties we may want to have per object.
And so, the only logical step, as I see it, is to do it for real. What I proposed, adding a slot for arbitrary properties per object, is simply a unification of the above, so you can state: all objects *also* have at least one property slot, which the language can use to store arbitrary properties per object, *without* the object in question needing direct knowledge of the nature of those properties (in contrast to instance variables, immutability, hash, etc.).
That also means the VM doesn't need to know anything about those properties either, which directly translates into less code and less complexity.
It's not like this is a new idea with unclear semantics or unknown utility. The VisualWorks implementation is simple and useful. VisualWorks is still the fastest Smalltalk around, so the performance impact can't be *too* high. VisualWorks is fairly similar to Squeak, so it's not unreasonable to use it as a model for considering immutability in Squeak.
Now, maybe there really are problems with the VW, VA or Gemstone implementations, and if so, let's hear about them. "It's not necessary" isn't a good argument, because nothing is really necessary—assembly works fine. What are the costs, benefits and trade-offs of VM-supported immutability?
For me the main cost is, of course, performance. If we take our existing images (which have no notion of immutability), they will perform slower on the same hardware. We can turn this con into a pro only once we start using immutability in various places... and here is where the fun begins: you cannot manage immutability state across more than ONE framework. So you will have to invent more complex schemes, like the "immutability managers" mentioned earlier.
Anyway, the fact is that most software still won't use it, but you will keep paying the price of having it. It's like introducing a new tax "to fund activities to keep homeowners' lawns green and clean" and collecting it from everyone, even from those who don't have a lawn and don't even own a house :)
Colin
On Thu, Jun 14, 2012 at 12:14 PM, Igor Stasenko siguctua@gmail.com wrote:
And so, an only logical step, as to me was to do it for real: I proposed to add a slot for arbitrary properties per object, is simply a unification of the above, so you can state:
- all objects *also* have at least one property slot, which can be
used by language to store arbitrary properties per object (or at least one), *without* a need from object in question to have a direct knowledge about the nature of these properties (in contrast to instance variables, immutability, hash etc).
Sure. Sounds good. I'd like to see that as well.
For me the main cost is, of course, performance. If we take our existing images (which have no notion about immutability) they will perform slower on same hardware.
Not true, actually. We're talking about adding an immutability bit to the new object format. Existing images won't run at all on a VM that uses the new format. Once those images are converted to the new format they'll run faster, with or without the immutability bit. That's not to say there's no performance impact, but come on, nobody is going to see performance worsen.
We can turn this cons into pros, only when we will start using immutability around the places.. and here is where fun stuff begins: you cannot manage immutability state over more than ONE framework. So you will have to invent more complex schemes like mentioned "immutability managers" etc..
That's theoretically a problem, yes. I've never seen it actually come up, though, and I have a hard time imagining a situation where you'd want to manage the state of a single object with more than one framework at a time.
Colin
On Thu, Jun 14, 2012 at 12:14 PM, Igor Stasenko siguctua@gmail.com wrote:
On 14 June 2012 20:31, Colin Putney colin@wiresong.com wrote:
[snip]
Well, my main argument against it is that Smalltalk has lived quite well without it so far.
But it *hasn't*, as evinced by the bugs in the base libraries surfaced by adding immutablity to VisualWorks (and no doubt to VisualAge as well).
If it were so crucial, it would have been there long ago (even back in ST-80), wouldn't it?
On a 16-bit machine with 32k objects there were many things one lived without. Would you have us do without ensure: and exceptions and tuples and closures just because Smalltalk-80 didn't have them? What a silly argument.
[snip]
-- Best regards, Igor Stasenko.
To be honest, I'm a little puzzled by the resistance to immutability.
and also because I am a strong proponent of reducing the number of contracts between the VM and the language: things like moving scheduling logic to the image side, moving identity-hash logic to the image side, etc., which make the VM less complex by having fewer contracts. This is part of my holy crusade against complexity :)
On Thu, Jun 14, 2012 at 12:28 PM, Igor Stasenko siguctua@gmail.com wrote:
To be honest, I'm a little puzzled by the resistance to immutability.
and also, because i am strong proponent of reducing a number of contracts between VM and language. Things, like moving scheduling logic to image side, moving identity hash logic to image side etc etc. which makes VM less complex, by having less contracts. This is a part of my holy crusade against complexity :)
Now we're getting somewhere. Simplicity is a good thing, even if holy wars aren't.
My main concern with immutability is that it's a specific hack to provide a join point for AOP-style management of state. It feels like there's a more general mechanism that we're missing. Maybe it would be useful to have an instance-variable equivalent of ObjectsAsMethods, whatever that might mean. A MOP, maybe? Macros?
OTOH, a more general solution starts to become a PL research project pretty quickly, and would probably never happen.
Colin
Hi:
On 14 Jun 2012, at 21:52, Colin Putney wrote:
On Thu, Jun 14, 2012 at 12:28 PM, Igor Stasenko siguctua@gmail.com wrote:
and also, because i am strong proponent of reducing a number of contracts between VM and language. Things, like moving scheduling logic to image side, moving identity hash logic to image side etc etc. which makes VM less complex, by having less contracts. This is a part of my holy crusade against complexity :)
My main concern with immutability is that it's a specific hack to provide a join point for AOP-style management of state. It feels like there's a more general mechanism that we're missing. Maybe it would be useful to have an instance-variable equivalent of ObjectsAsMethods, whatever that might mean. A MOP, maybe? Macros?
OTOH, a more general solution starts to become a PL research project pretty quickly, and would probably never happen.
My current hammer is a metaobject-based MOP. And immutability is a rather small nail ;)
The context is a little different, but indeed, I think Igor needs a MOP. And one of the possible variations is briefly discussed here:
http://soft.vub.ac.be/~smarr/2012/03/identifying-a-unifying-mechanism-for-th...
http://soft.vub.ac.be/~smarr/downloads/tools12-smarr-dhondt-identifying-a-un...
Or you chose one from here, Sec. 2.3, I think: http://swp.dcc.uchile.cl/TR/2009/TR_DCC-20091123-013.pdf
Best regards Stefan
On Thu, 14 Jun 2012, Igor Stasenko wrote:
To be honest, I'm a little puzzled by the resistance to immutability.
and also, because i am strong proponent of reducing a number of contracts between VM and language. Things, like moving scheduling logic to image side, moving identity hash logic to image side etc etc. which makes VM less complex, by having less contracts. This is a part of my holy crusade against complexity :)
IIUC you want to push stuff from VM side to the image. And you want to add a properties slot for all objects in the new VM. Why don't you do that in the image? It's the same as adding an instance variable to Object/ProtoObject, isn't it? :)
Levente
-- Best regards, Igor Stasenko.
On 16 June 2012 03:42, Levente Uzonyi leves@elte.hu wrote:
On Thu, 14 Jun 2012, Igor Stasenko wrote:
To be honest, I'm a little puzzled by the resistance to immutability.
and also, because i am strong proponent of reducing a number of contracts between VM and language. Things, like moving scheduling logic to image side, moving identity hash logic to image side etc etc. which makes VM less complex, by having less contracts. This is a part of my holy crusade against complexity :)
IIUC you want to push stuff from VM side to the image. And you want to add a properties slot for all objects in the new VM. Why don't you do that in the image? It's the same as adding an instance variable to Object/ProtoObject, isn't it? :)
In theory yes, but in practice the existing object format(s) do not allow you to do that: try adding an instance variable to any variable class with bytes/words format.
So, one way or another, you will need to change the object format to allow it. And if the VM would let you do things the way you describe, that would be even better: no extra responsibility for the VM, just add an instance var. But your smiley at the end suggests irony, as if I were proposing something very silly, right?
So allow me to be a bit ironic/moronic too...
Just take a look at the nice and beautiful implementation of the dependents protocol in Object. Every Object in the system can have an additional property named "dependents". Sounds close to what I am proposing... But of course it would be silly to implement it differently from the current implementation. For sure, you should use a huge weak identity dictionary, where you store this property.
But let's move on to Morphs... ah yes, morphs came to us from Self, where any object can have as many properties as it wants. But since we cannot afford that, what do we do? Yeah, we build a pile of mess, call it MorphExtension, and everyone is happy, and keeps *rolling eyes* at silly idiots like me.
And the last thing: you could ask where compiled methods store their properties, like pragmas etc. I will answer: at the very obvious and right place where you can find it, it just needs some elaboration: in the second-to-last literal of the method's literal array, unless that is reserved by the method's selector, in which case it is held by an AdditionalMethodState, very similar to what both Object and MorphExtension do. No doubt, if you do it any other way, you are just an idiot!
And sure thing, all of these examples showing very beautiful and elegant way how we should organize an object's data structure, when you cannot predict in advance, how many properties it may need.
On 2012-06-15, at 7:28 PM, Igor Stasenko wrote:
In theory yes, but in practice the existing object format(s) do not allow you to do that: try adding an instance variable to any variable class with bytes/words format.
It's worse than that - if you add an instance variable to Object, it changes the layout of Behavior instances. That instantly crashes the VM, because superclass pointers and method dictionaries aren't where the VM expects them to be.
Levente has a point, though. Management of the properties dictionary doesn't have to be implemented at the VM level; the VM just has to support a named instance variable in Object.
Colin
On 16 June 2012 07:45, Colin Putney colin@wiresong.com wrote:
On 2012-06-15, at 7:28 PM, Igor Stasenko wrote:
In theory yes, but in practice the existing object format(s) do not allow you to do that: try adding an instance variable to any variable class with bytes/words format.
It's worse than that - if you add an instance variable to Object, it changes the layout of Behavior instances. That instantly crashes the VM, because superclass pointers and method dictionaries aren't where the VM expects them to be.
Yes, that's why I stressed that an arbitrary property is not a slot that should be 'known' by the object itself, because most of the time the object in question doesn't need direct knowledge of the nature of the properties stored there.
This comes from the problem of adding properties, which can come from completely different, often anonymous, layers of the system or frameworks: you can never know what idea might come into the mind of another framework developer and what additional state he may need to store per object.
Let's just consider a single example with hash: imagine that we removed Sets and Dictionaries from our language. In such a system, objects would no longer need to care about their "hash" property, since there would be no users of it. Apparently that means the hash property is external to all objects in the system, and means something only to Sets and Dictionaries, which know how to reason about it.
Levente has a point, though. Management of the properties dictionary doesn't have to be implemented at the VM level; the VM just has to support a named instance variable in Object.
Err, where did I say anything about a dictionary? The VM just knows that all objects on the heap have at least one reference slot, apart from the other references (ivars/variable vars etc.). So you need only two primitives on the VM side to access it: read the slot value and write to that slot. The rest is completely up to the language side to decide what to do with it. Shall we use a dictionary to store properties there? Most probably yes.
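[Editor's sketch] Igor's split of responsibilities - the VM guarantees one hidden reference slot per object and exposes exactly two primitives, while all policy (such as lazily installing a property dictionary) lives on the language side - can be modeled in a few lines. This is a Python toy model, not VM code; all names (`prim_read_slot`, `properties_at_if_absent_put`, etc.) are illustrative assumptions.

```python
# Toy model: the VM's entire contract is one hidden slot + two primitives.

class Obj:
    """Every heap object carries one hidden, VM-guaranteed reference slot."""
    def __init__(self):
        self._prop_slot = None

# The only two "primitives" the VM must provide:
def prim_read_slot(obj):
    return obj._prop_slot

def prim_write_slot(obj, value):
    obj._prop_slot = value

# Language-side policy: lazily install a dictionary in the slot.
def properties_at_if_absent_put(obj, key, factory):
    props = prim_read_slot(obj)
    if props is None:
        props = {}
        prim_write_slot(obj, props)   # first property access creates the dict
    if key not in props:
        props[key] = factory()
    return props[key]

o = Obj()
oid = properties_at_if_absent_put(o, 'oid', lambda: 42)
```

Note how the VM never interprets the slot's contents: swapping the dictionary for any other structure is purely an image-side decision.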
Colin
On Sat, 16 Jun 2012, Igor Stasenko wrote:
On 16 June 2012 03:42, Levente Uzonyi leves@elte.hu wrote:
On Thu, 14 Jun 2012, Igor Stasenko wrote:
To be honest, I'm a little puzzled by the resistance to immutability.
And also because I am a strong proponent of reducing the number of contracts between the VM and the language: things like moving scheduling logic to the image side, moving identity-hash logic to the image side, etc., which make the VM less complex by giving it fewer contracts. This is part of my holy crusade against complexity :)
IIUC you want to push stuff from VM side to the image. And you want to add a properties slot for all objects in the new VM. Why don't you do that in the image? It's the same as adding an instance variable to Object/ProtoObject, isn't it? :)
In theory yes, but in practice the existing object format(s) do not allow you to do that: try adding an instance variable to any variable class with bytes/words format.
Right, Colin also mentioned another contract that would make it impossible with the current VM, but we're talking about a new VM, aren't we?
So, one way or another, you will need to change the object format to allow that. And if the VM allowed you to do things like you describe, that would be even better: no extra responsibility for the VM, just add an instance var. But your smiley at the end means irony, as if I were proposing something very silly, right?
Not at all, it's about the simplicity of this idea.
Btw, I'm not sure it's worth adding an extra slot to every object. Maybe it's worth using a bit in the object header to mark objects which have the properties slot and keeping the slot management on the VM side. When the slot is about to be accessed, check the bit and use lazy become if there's no slot yet. Of course, someone should do the hard work and make some measurements before deciding about this n+1th idea.
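[Editor's sketch] Levente's variant - create objects without the extra slot, mark its presence with a header bit, and grow the slot on first access via a (lazy) become - can be simulated like this. Pure Python model; the header bit value and the grow-in-place trick stand in for a real become and are assumptions, not the actual VM mechanism.

```python
# Header bit marks whether an object already has the properties slot;
# the slot is grown on demand, simulating "lazy become" into a larger object.

HAS_PROPS_BIT = 0x1

class HeapObject:
    def __init__(self):
        self.header = 0      # properties bit clear: no extra slot yet
        self.slots = []      # ordinary instance-variable slots only

def ensure_properties_slot(obj):
    """On first access, 'become' the object into one with the extra slot."""
    if not (obj.header & HAS_PROPS_BIT):
        obj.slots = [None] + obj.slots   # new slot at a fixed offset
        obj.header |= HAS_PROPS_BIT      # record its presence in the header
    return obj

def read_properties_slot(obj):
    ensure_properties_slot(obj)
    return obj.slots[0]
```

The tradeoff Levente mentions is visible here: every access pays a bit-test, in exchange for objects that never use properties staying one word smaller.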
So allow me to be a bit ironic/moronic as well...
Just take a look at the nice and beautiful implementation of the dependents protocol of Object. Every Object in the system can have an additional property named "dependents". Sounds close to what I am proposing... But of course it would be silly to implement it differently from the current implementation. For sure, you should use a huge weak identity dictionary to store this property.
But let's move on to Morphs... ah yes, morphs came to us from Self, where any object could have as many properties as it wants. But since we cannot afford that, what do we do? We build a pile of mess, call it MorphExtension, everyone is happy, and everyone keeps *rolling eyes* at silly idiots like me.
And the last thing... you could ask: where do CompiledMethods store their properties, like pragmas etc.? I will answer: in a very obvious and right place where you can find them; it just needs some elaboration: in the second-to-last literal of the method's literal array, unless it is reserved by the method's selector, in which case they are held by an AdditionalMethodState, very similar to what both Object and MorphExtension do. No doubt, if you do it any other way, you are just an idiot!
And sure enough, all of these examples show a very beautiful and elegant way to organize an object's data structure when you cannot predict in advance how many properties it may need.
These could be simplified/eliminated with your proposal and you could also use the properties slot to add immutability as you suggested, but I still don't see how these (and other external tools' and frameworks') different uses of the properties slot won't conflict with each other.
Levente
-- Best regards, Igor Stasenko.
On 16 June 2012 18:01, Levente Uzonyi leves@elte.hu wrote:
On Sat, 16 Jun 2012, Igor Stasenko wrote:
On 16 June 2012 03:42, Levente Uzonyi leves@elte.hu wrote:
On Thu, 14 Jun 2012, Igor Stasenko wrote:
To be honest, I'm a little puzzled by the resistance to immutability.
And also because I am a strong proponent of reducing the number of contracts between the VM and the language: things like moving scheduling logic to the image side, moving identity-hash logic to the image side, etc., which make the VM less complex by giving it fewer contracts. This is part of my holy crusade against complexity :)
IIUC you want to push stuff from VM side to the image. And you want to add a properties slot for all objects in the new VM. Why don't you do that in the image? It's the same as adding an instance variable to Object/ProtoObject, isn't it? :)
In theory yes, but in practice the existing object format(s) do not allow you to do that: try adding an instance variable to any variable class with bytes/words format.
Right, Colin also mentioned another contract that would make it impossible with the current VM, but we're talking about a new VM, aren't we?
Yes, then I agree. We could have a contract that all objects have at least one instance var (call it a reference), and the VM doesn't need any knowledge of it and should not attach any additional semantics to that ivar, apart from providing a way to access it. Then, as before, the VM is free to attach extra semantic value to the following slots, as in the case of Behavior and the indexes of the superclass, method dictionary, and format ivars. So, in practice, it means we would have to increase the offsets of all currently 'contracted' ivars in the VM (and plugins) by 1.
So, one way or another, you will need to change the object format to allow that. And if the VM allowed you to do things like you describe, that would be even better: no extra responsibility for the VM, just add an instance var. But your smiley at the end means irony, as if I were proposing something very silly, right?
Not at all, it's about the simplicity of this idea.
Btw, I'm not sure it's worth adding an extra slot to every object. Maybe it's worth using a bit in the object header to mark objects which have the properties slot and keeping the slot management on the VM side. When the slot is about to be accessed, check the bit and use lazy become if there's no slot yet. Of course, someone should do the hard work and make some measurements before deciding about this n+1th idea.
Well, we can have it either mandatory or optional (if the bit is set). It is an implementation detail, and we should actually test which one is more efficient in terms of the speed/space tradeoff :)
But if it is mandatory, then it combines quite well with lazy become. Lazy become requires at least one extra word per object to store the forwarding pointer there, because you cannot replace an object's header with a forwarding pointer: the header holds the information about how much memory is reserved for the object, and if you wipe that information you will have big problems walking the heap and compacting unused space.
So allow me to be a bit ironic/moronic as well...
Just take a look at the nice and beautiful implementation of the dependents protocol of Object. Every Object in the system can have an additional property named "dependents". Sounds close to what I am proposing... But of course it would be silly to implement it differently from the current implementation. For sure, you should use a huge weak identity dictionary to store this property.
But let's move on to Morphs... ah yes, morphs came to us from Self, where any object could have as many properties as it wants. But since we cannot afford that, what do we do? We build a pile of mess, call it MorphExtension, everyone is happy, and everyone keeps *rolling eyes* at silly idiots like me.
And the last thing... you could ask: where do CompiledMethods store their properties, like pragmas etc.? I will answer: in a very obvious and right place where you can find them; it just needs some elaboration: in the second-to-last literal of the method's literal array, unless it is reserved by the method's selector, in which case they are held by an AdditionalMethodState, very similar to what both Object and MorphExtension do. No doubt, if you do it any other way, you are just an idiot!
And sure enough, all of these examples show a very beautiful and elegant way to organize an object's data structure when you cannot predict in advance how many properties it may need.
These could be simplified/eliminated with your proposal and you could also use the properties slot to add immutability as you suggested, but I still don't see how these (and other external tools' and frameworks') different uses of the properties slot won't conflict with each other.
I do not see how the VM could help solve conflicts between frameworks. Any feature can be abused. As with the global namespace, this is something we should (and should be able to) deal with on the language side. But apart from the dark sides, look at the bright sides as well: different frameworks can actually share the same property if they need to.
And one last thing: consider JavaScript and Self (and many other dynamic languages), which allow you to have as many properties per object as you like, and they seem to live quite well with it. Except that, since it is their "default" setup, it is of course much less efficient than having a fixed set of "known" instance variables per instance. But look at what the V8 guys did: they actually came to the same idea, but from the other side.
So there is nothing new in what I am proposing: we can have the best of both worlds.
P.S. Oh yes, I did not suggest that the immutability bit has to be stored in the additional properties slot, because checking it would be even less efficient than using a bit in the header. I just wanted to point out that immutability is actually a property and, like hash, has the same "external" nature for all objects: these properties are meaningful only to certain frameworks/layers in a system, not globally to every possible class.
Levente
-- Best regards, Igor Stasenko.
On Sat, 16 Jun 2012, Igor Stasenko wrote:
On 16 June 2012 18:01, Levente Uzonyi leves@elte.hu wrote:
On Sat, 16 Jun 2012, Igor Stasenko wrote:
On 16 June 2012 03:42, Levente Uzonyi leves@elte.hu wrote:
On Thu, 14 Jun 2012, Igor Stasenko wrote:
To be honest, I'm a little puzzled by the resistance to immutability.
And also because I am a strong proponent of reducing the number of contracts between the VM and the language: things like moving scheduling logic to the image side, moving identity-hash logic to the image side, etc., which make the VM less complex by giving it fewer contracts. This is part of my holy crusade against complexity :)
IIUC you want to push stuff from VM side to the image. And you want to add a properties slot for all objects in the new VM. Why don't you do that in the image? It's the same as adding an instance variable to Object/ProtoObject, isn't it? :)
In theory yes, but in practice the existing object format(s) do not allow you to do that: try adding an instance variable to any variable class with bytes/words format.
Right, Colin also mentioned another contract that would make it impossible with the current VM, but we're talking about a new VM, aren't we?
Yes, then I agree. We could have a contract that all objects have at least one instance var (call it a reference), and the VM doesn't need any knowledge of it and should not attach any additional semantics to that ivar, apart from providing a way to access it. Then, as before, the VM is free to attach extra semantic value to the following slots, as in the case of Behavior and the indexes of the superclass, method dictionary, and format ivars. So, in practice, it means we would have to increase the offsets of all currently 'contracted' ivars in the VM (and plugins) by 1.
So, one way or another, you will need to change the object format to allow that. And if the VM allowed you to do things like you describe, that would be even better: no extra responsibility for the VM, just add an instance var. But your smiley at the end means irony, as if I were proposing something very silly, right?
Not at all, it's about the simplicity of this idea.
Btw, I'm not sure it's worth adding an extra slot to every object. Maybe it's worth using a bit in the object header to mark objects which have the properties slot and keeping the slot management on the VM side. When the slot is about to be accessed, check the bit and use lazy become if there's no slot yet. Of course, someone should do the hard work and make some measurements before deciding about this n+1th idea.
Well, we can have it either mandatory or optional (if the bit is set). It is an implementation detail, and we should actually test which one is more efficient in terms of the speed/space tradeoff :)
But if it is mandatory, then it combines quite well with lazy become. Lazy become requires at least one extra word per object to store the forwarding pointer there, because you cannot replace an object's header with a forwarding pointer: the header holds the information about how much memory is reserved for the object, and if you wipe that information you will have big problems walking the heap and compacting unused space.
If objects are aligned to 4 (or 8) bytes, then there are 2 (or 3) free bits per pointer. If you use two of these bits to store 3 states - valid object, forwarded object with no slots, forwarded object with slots (the object size is stored in the next slot) - and the object header also uses these 2 bits the same way, then you don't need the extra word and you can still find out the size of forwarded objects.
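[Editor's sketch] Levente's tagging trick can be made concrete: with 8-byte alignment the low bits of every pointer are always zero, so two of them can carry a forwarding state without any extra word. A Python model using integers as addresses; the state encodings are illustrative assumptions.

```python
# Model of 2-bit pointer tagging: 8-byte alignment frees the low 3 bits,
# two of which encode one of three states.

VALID          = 0b00  # ordinary object / pointer
FWD_NO_SLOTS   = 0b01  # forwarded; the object had no slots
FWD_WITH_SLOTS = 0b10  # forwarded; original size is stored in the next slot

def tag(addr, state):
    """Pack a state into the alignment bits of an 8-byte-aligned address."""
    assert addr % 8 == 0, "pointers must be 8-byte aligned"
    return addr | state

def state_of(tagged):
    return tagged & 0b11

def addr_of(tagged):
    return tagged & ~0b11   # mask the tag off to recover the real address
```

The fourth state (0b11) remains free, which is the slack Levente later points to when immediates come up.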
So allow me to be a bit ironic/moronic as well...
Just take a look at the nice and beautiful implementation of the dependents protocol of Object. Every Object in the system can have an additional property named "dependents". Sounds close to what I am proposing... But of course it would be silly to implement it differently from the current implementation. For sure, you should use a huge weak identity dictionary to store this property.
But let's move on to Morphs... ah yes, morphs came to us from Self, where any object could have as many properties as it wants. But since we cannot afford that, what do we do? We build a pile of mess, call it MorphExtension, everyone is happy, and everyone keeps *rolling eyes* at silly idiots like me.
And the last thing... you could ask: where do CompiledMethods store their properties, like pragmas etc.? I will answer: in a very obvious and right place where you can find them; it just needs some elaboration: in the second-to-last literal of the method's literal array, unless it is reserved by the method's selector, in which case they are held by an AdditionalMethodState, very similar to what both Object and MorphExtension do. No doubt, if you do it any other way, you are just an idiot!
And sure enough, all of these examples show a very beautiful and elegant way to organize an object's data structure when you cannot predict in advance how many properties it may need.
These could be simplified/eliminated with your proposal and you could also use the properties slot to add immutability as you suggested, but I still don't see how these (and other external tools' and frameworks') different uses of the properties slot won't conflict with each other.
I do not see how the VM could help solve conflicts between frameworks. Any feature can be abused. As with the global namespace, this is something we should (and should be able to) deal with on the language side. But apart from the dark sides, look at the bright sides as well: different frameworks can actually share the same property if they need to.
And one last thing: consider JavaScript and Self (and many other dynamic languages), which allow you to have as many properties per object as you like, and they seem to live quite well with it.
But those are different languages.
Except that, since it is their "default" setup, it is of course much less efficient than having a fixed set of "known" instance variables per instance. But look at what the V8 guys did: they actually came to the same idea, but from the other side.
What do you mean by "same idea"? IIRC they created classes for js objects behind the scenes.
Levente
So there is nothing new in what I am proposing: we can have the best of both worlds.
P.S. Oh yes, I did not suggest that the immutability bit has to be stored in the additional properties slot, because checking it would be even less efficient than using a bit in the header. I just wanted to point out that immutability is actually a property and, like hash, has the same "external" nature for all objects: these properties are meaningful only to certain frameworks/layers in a system, not globally to every possible class.
Levente
-- Best regards, Igor Stasenko.
-- Best regards, Igor Stasenko.
On 17.06.2012 07:09, Levente Uzonyi wrote:
On Sat, 16 Jun 2012, Igor Stasenko wrote:
Well, we can have it either mandatory or optional (if the bit is set). It is an implementation detail, and we should actually test which one is more efficient in terms of the speed/space tradeoff :)
But if it is mandatory, then it combines quite well with lazy become. Lazy become requires at least one extra word per object to store the forwarding pointer there, because you cannot replace an object's header with a forwarding pointer: the header holds the information about how much memory is reserved for the object, and if you wipe that information you will have big problems walking the heap and compacting unused space.
If objects are aligned to 4 (or 8) bytes, then there are 2 (or 3) free bits per pointer. If you use two of these bits to store 3 states - valid object, forwarded object with no slots, forwarded object with slots (the object size is stored in the next slot) - and the object header also uses these 2 bits the same way, then you don't need the extra word and you can still find out the size of forwarded objects.
Those bits are already used for immediate objects, no?
Cheers, Henry
On Mon, 18 Jun 2012, Henrik Sperre Johansen wrote:
On 17.06.2012 07:09, Levente Uzonyi wrote:
On Sat, 16 Jun 2012, Igor Stasenko wrote:
Well, we can have it either mandatory or optional (if the bit is set). It is an implementation detail, and we should actually test which one is more efficient in terms of the speed/space tradeoff :)
But if it is mandatory, then it combines quite well with lazy become. Lazy become requires at least one extra word per object to store the forwarding pointer there, because you cannot replace an object's header with a forwarding pointer: the header holds the information about how much memory is reserved for the object, and if you wipe that information you will have big problems walking the heap and compacting unused space.
If objects are aligned to 4 (or 8) bytes, then there are 2 (or 3) free bits per pointer. If you use two of these bits to store 3 states - valid object, forwarded object with no slots, forwarded object with slots (the object size is stored in the next slot) - and the object header also uses these 2 bits the same way, then you don't need the extra word and you can still find out the size of forwarded objects.
Those bits are already used for immediate objects, no?
It's a new object format, everything is possible. Since my suggestion only uses 3 states, there's at least one more which can be used for immediates even if only 2 bits are available.
Levente
Cheers, Henry
On 18.06.2012 14:48, Levente Uzonyi wrote:
On Mon, 18 Jun 2012, Henrik Sperre Johansen wrote:
On 17.06.2012 07:09, Levente Uzonyi wrote:
On Sat, 16 Jun 2012, Igor Stasenko wrote:
Well, we can have it either mandatory or optional (if the bit is set). It is an implementation detail, and we should actually test which one is more efficient in terms of the speed/space tradeoff :)
But if it is mandatory, then it combines quite well with lazy become. Lazy become requires at least one extra word per object to store the forwarding pointer there, because you cannot replace an object's header with a forwarding pointer: the header holds the information about how much memory is reserved for the object, and if you wipe that information you will have big problems walking the heap and compacting unused space.
If objects are aligned to 4 (or 8) bytes, then there are 2 (or 3) free bits per pointer. If you use two of these bits to store 3 states - valid object, forwarded object with no slots, forwarded object with slots (the object size is stored in the next slot) - and the object header also uses these 2 bits the same way, then you don't need the extra word and you can still find out the size of forwarded objects.
Those bits are already used for immediate objects, no?
It's a new object format, everything is possible.
Sure, everything is possible. But is it realistic/useful?
Since my suggestion only uses 3 states, there's at least one more which can be used for immediates even if only 2 bits are available.
Yes, and you're down to at most 61/60-bit immediate ints/doubles, or 59 bits if the originally proposed lookup-table format were used.
Considering that only instances of classes with *no* slots (if you don't have state, why bother making instances in the first place?) would end up using less space under this scheme, I don't see how it warrants using twice the header bits and a smaller range for immediates, over just ensuring that the minimum object size, including the header, is 16 bytes.
Cheers, Henry
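[Editor's note] Henrik's arithmetic is easy to check: on a 64-bit word, every bit spent on tagging comes straight out of the immediate value's payload. The exact tag assignments below (2 forwarding-state bits plus 1 immediate bit, or a 5-bit lookup-table tag) are my reading of the exchange, not a statement of the final design.

```python
# Payload width of an immediate value after reserving tag bits on a 64-bit word.

WORD_BITS = 64

def immediate_value_bits(tag_bits):
    """Bits left for the immediate value itself."""
    return WORD_BITS - tag_bits

# 2 forwarding-state bits + 1 immediate-marker bit -> 61-bit SmallIntegers
sixty_one = immediate_value_bits(3)
# a 5-bit lookup-table tag would leave only 59 bits
fifty_nine = immediate_value_bits(5)
```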
On 2012-06-13, at 14:10, Igor Stasenko wrote:
On 13 June 2012 09:41, Bert Freudenberg bert@freudenbergs.de wrote:
On 2012-06-13, at 05:27, Igor Stasenko wrote:
Another (open) question is how to deal with immutability in the presence of become, i.e.:
mutableCopy := immutableObject shallowCopy.
immutableObject becomeForward: mutableCopy.
Such things make immutability useless for protecting critical parts of object memory, like preventing the modification of compiled method literals: yes, you cannot change the immutable object, but you can replace all pointers to it with its own mutable copy, which is equivalent to making it mutable again.
Why should the VM allow become of an immutable object?
You can disallow this. But that only makes it harder: I can do it manually - take all objects pointing to the immutable one and replace their pointers to it with its mutable copy. And that is completely legal, except that it will be a bit harder, since it is not done by a primitive.
Okay, I guess you're right. (although the solution would be "don't do that" rather than "immutability is useless").
But become would not replace a reference in an immutable object. Which is a major point of immutable objects: all objects "inside" an immutable object are immutable, too. (although I think even that was up for discussion last time we had this conversation)
Disallowing #become on immutables raises many additional questions:
what do you do when you need to migrate instances of a class due to reshaping it, while some of them are immutable?
Interesting case. One solution would be to simply fail the class reshape if there are immutable instances. One would have to do a mutable copy + become.
(I bet there are many other examples where this will break existing traditional schemes, like working with proxies etc.)
It would have to be used sparingly and with care, sure.
I don't want to spread FUD... I just want to make sure that we are ready to answer every such question.
Yep. That's why we're discussing :)
- Bert -
And once we have per-object properties and lazy become, things like Magma will get HUGE benefits straight out of the box.
It's true -- one of the busiest occupations for Magma is to maintain its two-way [object<--> oid] dictionary mappings. Dropping that in favor of a fast per-object #oid attribute would be like dropping an anchor.
It seems like per-object attributes could provide some amazing leverage in other domains too. +1.
On Mon, Jun 11, 2012 at 1:36 AM, Igor Stasenko siguctua@gmail.com wrote:
Some extra ideas.
- Avoiding extra header for big sized objects.
I'm not sure about this, but still...
according to Eliot's design: 8: slot size (255 => extra header word with large size)
What if we extend the size field to 16 bits (so in total it would be 65536 slots)?
This simply doesn't make sense within the overall context of the header (i.e. a relatively large identityHash the same size as the class index). A large size field increases the size of the header for all objects. An extra size word for large objects increases the size (probably by 8 bytes) only for large objects. But that's a very small percentage overhead of at most 8 / (256 * 4), or 0.8%. Few objects are large; the bulk of objects are smaller than 256 slots. It's a no-brainer: have a small size field and overflow it only for large objects.
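[Editor's note] Eliot's worst-case figure checks out: the extra 8-byte size word is only paid by objects with at least 256 slots, so at 4 bytes per slot the relative overhead is bounded as follows.

```python
# Worst-case relative overhead of an 8-byte overflow size word:
# it only applies to objects of >= 256 slots, i.e. >= 256 * 4 bytes of body.

overhead = 8 / (256 * 4)
print(f"{overhead:.2%}")   # prints 0.78%, i.e. "about 0.8%" as Eliot says
```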
and we have a single flag indicating how to calculate the object size:
flag(0): object size = (size field) * 8
flag(1): object size = 2 ^ (size field)
which means that past 2^16 (or however many bits we dedicate to the size field in the header) all object sizes would be powers of two. Since most objects fit under 2^16, we don't lose much. For big arrays, we could have a special collection/array which stores the exact size in an inst var (and we don't even need to care in the cases of Sets/Dicts/OrderedCollections). We can also make it transparent:
Array class>>new: size
	size > MaxExactSize ifTrue: [ ^ ArrayWithBigSizeWhatever new: size ]
Of course, care must be taken for those variable classes which can potentially hold large amounts of bytes (like Bitmap). But I think code can quickly be adapted to this VM feature, which would simply fail the #new: primitive if the size is greater than the maximum "exact size" that fits in the header's size field and is not a power of two.
- Slot for arbitrary properties.
If you read carefully, Eliot said that to enable lazy become it is necessary to always have some extra space per object, even if the object doesn't have any fields:
<<We shall probably keep the minimum object size at 16 bytes so that there is always room for a forwarding pointer.>>
So this fits quite well with the idea of having a slot for dynamic properties per object. What if, instead of "extending" an object when it requires an extra properties slot, we just reserve the properties slot at the very beginning:
[ header ] [ properties slot] ... rest of data ..
so that any object has that slot. And in the case of lazy become, we can use that slot to hold the forwarding pointer. Voila.
- From 2 we go straight back to hash: the VM doesn't need to know such a thing as an object's hash; it has no semantic load inside the VM, which just answers those bits via a single primitive.
So why is it a kind of enforced, inherent property of all objects in the system? And why does nobody ask: if we have that one, why couldn't we have more than one, or as many as we want? This is my central question around the idea of per-object properties. Once the VM guarantees that any object has at least one slot for storing an object reference (the property slot), the VM no longer needs to care about identity hash.
Because it can be implemented completely on the language side. But most of all, we are NO longer limited in how big or small hash values can be, which directly converts into bonuses: fewer hash collisions -> more performance. Want a 64-bit hash? 128-bit? Whatever you desire:
Object>>identityHash
	^ self propertiesAt: #hash ifAbsentPut: [ HashGenerator newHashValue ]
And once we have per-object properties and lazy become, things like Magma will get HUGE benefits straight out of the box. Because look: lazy become and immutability address many problems related to OODB implementation (I barely see other use cases where immutability would be as useful as for an OODB). So for me it is logical to take this last step: with arbitrary properties, an OODB can store its ID there.
-- Best regards, Igor Stasenko.
On 13 June 2012 20:50, Eliot Miranda eliot.miranda@gmail.com wrote:
On Mon, Jun 11, 2012 at 1:36 AM, Igor Stasenko siguctua@gmail.com wrote:
Some extra ideas.
- Avoiding extra header for big sized objects.
I'm not sure about this, but still...
according to Eliot's design: 8: slot size (255 => extra header word with large size)
What if we extend the size field to 16 bits (so in total it would be 65536 slots)?
This simply doesn't make sense within the overall context of the header (i.e. a relatively large identityHash the same size as the class index). A large size field increases the size of the header for all objects. An extra size word for large objects increases the size (probably by 8 bytes) only for large objects. But that's a very small percentage overhead of at most 8 / (256 * 4), or 0.8%. Few objects are large; the bulk of objects are smaller than 256 slots. It's a no-brainer: have a small size field and overflow it only for large objects.
I was not measuring it in terms of space, but in terms of not having an additional word to deal with. Can you please fill in the gaps in your design and explain how you perform heap walking? The current design reserves the two least significant bits to indicate whether an object header is 1, 2 or 3 words. But in your proposed format the least significant bits are occupied by the slot-size field, which can be an arbitrary value. So how do you implement heap walking and determine whether the first word of the next object is its header or its size field?
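[Editor's sketch] One possible answer to Igor's heap-walking question: a walker always proceeds from the heap base, header by header, so it consumes overflow size words in a known order and never has to guess what a word is. This Python model is only my reading of Eliot's "255 => extra header word" design, not its actual implementation; the word layout is an assumption.

```python
# Model: a slot count of 255 in the header signals that the real size is in
# the word allocated just before the header. Sequential walking resolves
# the header-vs-size-word ambiguity by construction.

OVERFLOW = 255   # slot-count value meaning "size is in the preceding word"

def make_object(nslots):
    """Return the words an allocator would emit for one object."""
    if nslots < OVERFLOW:
        return [("header", nslots)] + [("slot", None)] * nslots
    # large object: an extra size word precedes the ordinary header
    return [("size", nslots), ("header", OVERFLOW)] + [("slot", None)] * nslots

def walk(heap):
    """Yield each object's slot count, in allocation order."""
    i = 0
    while i < len(heap):
        kind, value = heap[i]
        if kind == "size":        # overflow word: real size, header follows
            nslots = value
            i += 2                # skip the size word and the header
        else:
            assert kind == "header"
            nslots = value
            i += 1
        yield nslots
        i += nslots               # skip the body slots

heap = make_object(3) + make_object(300) + make_object(0)
```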
and we have a single flag indicating how to calculate the object size:
flag(0): object size = (size field) * 8
flag(1): object size = 2 ^ (size field)
which means that past 2^16 (or how many bits we dedicate to size field in header) all object sizes will be power of two. Since most of the objects will fit under 2^16, we don't lose much. For big arrays, we could have a special collection/array, which will store exact size in it's inst var (and we even don't need to care in cases of Sets/Dicts/OrderedCollections). Also we can actually make it transparent:
Array class>>new: size
	size > (max exact size) ifTrue: [ ^ ArrayWithBigSizeWhatever new: size ]
Of course, care must be taken for those variable classes which can potentially hold large amounts of bytes (like Bitmap). But I think code can be quickly adapted to this feature of the VM, which will simply fail the #new: primitive if the size is not a power of two and is greater than the maximum "exact size" that can fit into the size field of the header.
- Slot for arbitrary properties.
If you read carefully, Eliot said that to make lazy become work it is necessary to always have some extra space per object, even if the object doesn't have any fields:
<<We shall probably keep the minimum object size at 16 bytes so that there is always room for a forwarding pointer. >>
So this fits quite well with the idea of having a slot for dynamic properties per object. What if, instead of "extending" an object when it requires an extra properties slot, we just reserve a slot for properties at the very beginning:
[ header ] [ properties slot] ... rest of data ..
so any object will have that slot. And in the case of lazy become, we can use that slot to hold the forwarding pointer. Voila.
- From 2 we go straight back to hash. The VM doesn't need to know such a thing as an object's hash; it has no semantic load inside the VM, which just answers those bits via a single primitive.
So why is it an enforced, inherent property of all objects in the system? And why does nobody ask: if we can have that one, why can't we have more than one, or as many as we want? This is my central question around the idea of having per-object properties. Once the VM guarantees that any object can have at least one slot for storing an object reference (the property slot), the VM no longer needs to care about identity hash.
It can be implemented completely on the language side. But most of all, we are no longer limited in how big or small hash values can be, which directly converts into bonuses: fewer hash collisions -> more performance. Want a 64-bit hash? 128-bit? Whatever you desire:
Object>>identityHash
	^ self propertiesAt: #hash ifAbsentPut: [ HashGenerator newHashValue ]
And once we have per-object properties and lazy become, things like Magma will get HUGE benefits straight out of the box. Because look: lazy become and immutability address many problems related to OODB implementations (I barely see other use cases where immutability would be as useful as in OODBs). So for me it is logical to take this last step: by adding arbitrary properties, an OODB can now store its ID there.
-- Best regards, Igor Stasenko.
-- best, Eliot
On Wed, Jun 13, 2012 at 12:46 PM, Igor Stasenko siguctua@gmail.com wrote:
On 13 June 2012 20:50, Eliot Miranda eliot.miranda@gmail.com wrote:
On Mon, Jun 11, 2012 at 1:36 AM, Igor Stasenko siguctua@gmail.com
wrote:
Some extra ideas.
- Avoiding extra header for big sized objects.
I'm not sure about this, but still...
according to Eliot's design: 8: slot size (255 => extra header word with large size)
What if we extend size to 16 bits (so in total it will be 65536 slots)
This simply doesn't make sense within the overall context of the header (i.e. a relatively large identityHash, the same size as the class index). A large size field increases the size of the header for all objects. An extra size word increases the size (probably by 8 bytes) only for large objects. But that's a very small percentage overhead, at most 8 / (256 * 4), or about 0.8%. Few objects are large; the bulk of objects are smaller than 256 slots. It's a no-brainer: have a small size field and overflow only for large objects.
I didn't measure it in terms of space, but in terms of not having an additional word to deal with. Can you please fill the gaps in your design and explain how you perform heap walking?
Hmm, that's a detail :) Could you list the gaps you see in the design?
The current design reserves the two least significant bits to indicate whether the object header is 1, 2 or 3 words. But in your proposed format the least significant bits are reserved for the slot size field, which can be an arbitrary value. So how do you implement heap walking and determine whether the first word of the next object is its header or its size field?
OK, here's one way to implement heap walking:
In heap walking the memory manager needs to be able to detect the start of the next object. This is complicated by the short and long header formats, short being for objects with 254 slots or less, long being for objects with 255 slots or more. The class index field can be used to mark special objects. In particular the tagged class indices 1 through 7, which correspond to objects with tag bits 1 through 7 (SmallInteger = 1, 3, 5, 7; Character = e.g. 2; SmallFloat = e.g. 4), never occur in the class index fields of normal objects. So if the size doubleword uses all bits other than the class field (44 bits gives an adequate maximum size of 2^46 bytes, ~10^14 bytes), then size doublewords can be marked by using one of the tagged class indices in their class field. To identify the next object the VM fetches the doubleword immediately following the current object (object bodies being rounded up to 8 bytes in the 32-bit VM). If that doubleword's class index field is the size-doubleword class index pun, e.g. 1, then it is a size field, the object header is the doubleword following it, and the object's slots start after that. If not, the object header is that doubleword and the object's slots follow it.
-- best, Eliot
-- Best regards, Igor Stasenko.
On 14 June 2012 01:08, Eliot Miranda eliot.miranda@gmail.com wrote:
On Wed, Jun 13, 2012 at 12:46 PM, Igor Stasenko siguctua@gmail.com wrote:
On 13 June 2012 20:50, Eliot Miranda eliot.miranda@gmail.com wrote:
I didn't measure it in terms of space, but in terms of not having an additional word to deal with. Can you please fill the gaps in your design and explain how you perform heap walking?
Hmm, that's a detail :) Could you list the gaps you see in the design?
Hehe, if you like. It is not clear how many points in the VM should be aware of forwarded objects, and where you can just pass them around. Obviously, the places where you need to access an object's data will require that check. But I fear this could be too costly. If this check is placed at the slot read operation, then every slot read will mean two memory reads (read the reference value, then read what it points to and check whether it refers to a forwarded oop). If we cannot avoid that, then "lazy" become will mean a "crawling" runtime :)
Actually, on the other hand, if something reads an oop, then in most cases it will need to access its contents at some point (the only exceptions being copy/assignment operations). So doing this check will just bring the object's contents into the CPU cache, but I think the slowdown will still be too significant to just ignore.
So I'd like to know your thoughts about minimizing this bad impact. I am interested in how many places we can have with a strong guarantee that no forwarded oops will ever appear there. Or can we not have such places, and are we doomed to always keep checking oops on every read?
The current design reserves the two least significant bits to indicate whether the object header is 1, 2 or 3 words. But in your proposed format the least significant bits are reserved for the slot size field, which can be an arbitrary value. So how do you implement heap walking and determine whether the first word of the next object is its header or its size field?
OK, here's one way to implement heap walking:
In heap walking the memory manager needs to be able to detect the start of the next object. This is complicated by the short and long header formats, short being for objects with 254 slots or less, long being for objects with 255 slots or more. The class index field can be used to mark special objects. In particular the tagged class indices 1 through 7, which correspond to objects with tag bits 1 through 7 (SmallInteger = 1, 3, 5, 7; Character = e.g. 2; SmallFloat = e.g. 4), never occur in the class index fields of normal objects. So if the size doubleword uses all bits other than the class field (44 bits gives an adequate maximum size of 2^46 bytes, ~10^14 bytes), then size doublewords can be marked by using one of the tagged class indices in their class field. To identify the next object the VM fetches the doubleword immediately following the current object (object bodies being rounded up to 8 bytes in the 32-bit VM). If that doubleword's class index field is the size-doubleword class index pun, e.g. 1, then it is a size field, the object header is the doubleword following it, and the object's slots start after that. If not, the object header is that doubleword and the object's slots follow it.
Okay. Nice trick. :)
So, can we have a slot for arbitrary properties? Then you can free the bits in the header reserved for the hash, and expand both the size and class index fields.
vm-dev@lists.squeakfoundation.org