On Mon, Sep 23, 2013 at 3:10 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
Isn't Eliot just implementing this feature, having a segment of non-relocatable objects?
Sort of. Spur will allow any object in old space to stay put, and through become it can move any object to old space very simply. So instead of having a special fixed-space segment it has a best-fit compaction algorithm that doesn't slide objects, but moves them into holes. With this kind of compaction it is very easy to leave objects where they are.
A further advantage of Spur is that objects have 64-bit alignment, so passing arrays to, e.g., code using SSE instructions won't cause potential alignment faults.
But for NB see below on the hack that the ThreadedFFI uses.
2013/9/23 Igor Stasenko siguctua@gmail.com
On 23 September 2013 21:40, Igor Stasenko siguctua@gmail.com wrote:
On 23 September 2013 16:23, Camillo Bruni camillobruni@gmail.com wrote:
Hi Jan,
I think I will add the ByteArray accessor to NBExternalAddress today or tomorrow since I need it as well for another project.
hmm, reading from memory into a ByteArray can be done with a memory copy:
inputs: address , offset , size to read
newAddress := NBExternalAddress value: address value + offset.
buffer := ByteArray new: size.
NativeBoost memCopy: newAddress to: buffer size: size.
the same way for writing; just swap the source and destination:
newAddress := NBExternalAddress value: address value + offset.
buffer := "given from somewhere".
NativeBoost memCopy: buffer to: newAddress size: size.
but as Jan noted, you cannot ask it to read/write starting at a specified offset from/to the ByteArray, e.g.:
copy from: address to: buffer + someOffset
nor:
copy from: buffer + someOffset to: someAddress
this is where we need to introduce a special 'field address' type, so you can construct it like this:
offsetAddress := buffer nbAddressAt: offset.
so then you can pass it to any function which expects an address, like memory copy or any foreign function.
Since objects move in memory, we cannot calculate the address of a field beforehand:
address := NBExternalAddress value: someObject address + offset.
because if a GC happens after computing such an address and *before* its actual use, you will read/write to the wrong location.
Thus we should keep the oop + offset up to the point of passing it to the external function, under controlled conditions that guarantee no GC is possible.
Things would be much simpler if we had pinning, wouldn't they? :)
Yes, but for the moment there is a hack one can use, a neat hack invented by Andreas Raab. The Squeak GC is a two-space GC: old space (collected by fullGC) and new space (collected by incrementalGC). An incrementalGC will move objects in new space but leave objects in old space alone. A tenuringIncrementalGC will compact new space and then make new space part of old space. Therefore one way of nearly pinning objects is to tenure them into old space via tenuringIncrementalGC and then lock fullGC, preventing fullGC from running until the external call is finished. All arguments to the call become old, and they won't be moved until the fullGCLock is released. This doesn't help with passing a buffer that will be used after the call returns, but it does help with a buffer being passed to code that might call back.
See uses of PrimErrObjectMayMove, e.g. ThreadedFFIPlugin>>primitiveCallout and platforms/Cross/plugins/FilePlugin/sqFilePluginBasicPrims.c>>sqFileReadIntoAt
HTH eliot
On Sep 24, 2013, at 9:58 , Eliot Miranda eliot.miranda@gmail.com wrote:
A further advantage of Spur is that objects have 64-bit alignment, so passing arrays to, e.g., code using SSE instructions won't cause potential alignment faults.
Not to be a drag, but SSE operations have a 16-byte alignment requirement. With 8-byte alignment of the first indexable slot, you'd still need to use unaligned loads/stores.
Cheers, Henry
On Wed, Sep 25, 2013 at 1:14 AM, Henrik Johansen < henrik.s.johansen@veloxit.no> wrote:
Not to be a drag, but SSE operations have a 16-byte alignment requirement. With 8-byte alignment of the first indexable slot, you'd still need to use unaligned loads/stores.
Hi Henry, thanks. That's potentially tedious to achieve because objects, having either 8-byte or 16-byte headers, are not trivial to align such that their first field is always on a 16-byte boundary. But since we're only talking about pinned objects in old space, it should be reasonably straightforward to add the constraint that pinned objects are so aligned. Off the top of my head it may need the memory manager to add a "sliver" object type, an 8-byte header with no fields, collected by coalescing with adjacent free blocks, that can fill the 8-byte gaps that aligning pinned objects on 16-byte boundaries would create. I'll think about this.
On 25-09-2013, at 9:30 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Off the top of my head it may need the memory manager to add a "sliver" object type, an 8-byte header with no fields, collected by coalescing with adjacent free blocks, that can fill the 8-byte gaps that aligning pinned objects on 16-byte boundaries would create. I'll think about this.
Wild-ass suggestion: would it work to allocate the relevant object (which I'm assuming here is essentially an UninterpretedBytes) with an 'extra' 8 bytes and then store the 'real' bits at the right place to get the 16-byte boundary? Yes, there would have to be magic to signify where the real start of data is. Yes, it would potentially change on scavenge, but since these are largely pinned, not such a big issue maybe?
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Klingon Code Warrior:- 4) "This machine is a piece of GAGH! I need dual G5 processors if I am to do battle with this code!"
On Wed, Sep 25, 2013 at 10:39 AM, tim Rowledge tim@rowledge.org wrote:
Wild-ass suggestion: would it work to allocate the relevant object (which I'm assuming here is essentially an UninterpretedBytes) with an 'extra' 8 bytes and then store the 'real' bits at the right place to get the 16-byte boundary? Yes, there would have to be magic to signify where the real start of data is. Yes, it would potentially change on scavenge, but since these are largely pinned, not such a big issue maybe?
In general that's a bad idea because it loads the frequent path (accessing the object) with the work, unloading the infrequent path (pinning the object). It can't happen during scavenge because, by definition, pinned objects are not in new space. Hence I think the time to take the hit is on pinning (an infrequent operation). Make sense?
On 25-09-2013, at 11:36 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
In general that's a bad idea because it loads the frequent path (accessing the object) with the work, unloading the infrequent path (pinning the object). It can't happen during scavenge because, by definition, pinned objects are not in new space. Hence I think the time to take the hit is on pinning (an infrequent operation). Make sense?
Almost always. But it's worth throwing wild ideas around occasionally just to see if they stick to the blanket. How about two subclasses, one for each possible 8-byte alignment? The FFI code only gets a pointer to the right place anyway, so it won't care. Accessor methods would be minimally different for the two subclasses and handle the offset for ST code. Compacting, if it ever touched instances, would swap the class if required. How's that for adaptive optimisation ;-)
Or how about simply not allocating the actual memory in object space? I know it can be a pain but perhaps it's less of a pain in the end.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: DMZ: Divide Memory by Zero
Hi Tim, Hi All,
On Wed, Sep 25, 2013 at 12:46 PM, tim Rowledge tim@rowledge.org wrote:
Almost always. But it's worth throwing wild ideas around occasionally just to see if they stick to the blanket.
Yes!! I was almost going to post something on just this point. Thanks for the prompt. I want to encourage everyone to throw ideas out there without fear of their being proved wrong. Discussing the ideas has great value whether or not they're right. The other day Clement had a good idea about become. I'd not thought about the issue he brought up before, and his suggestion, even though I don't think it'll work, made me think about the issue and helped clarify my thinking. So yes, please keep throwing out those ideas. And don't get discouraged by critical responses.
How about two subclasses, one for each possible 8-byte alignment? The FFI code only gets a pointer to the right place anyway, so it won't care. Accessor methods would be minimally different for the two subclasses and handle the offset for ST code. Compacting, if it ever touched instances, would swap the class if required. How's that for adaptive optimisation ;-)
That could work, but offends my sense of the separation between the system and the VM. So I don't feel like going there :-)
Or how about simply not allocating the actual memory in object space? I know it can be a pain but perhaps it's less of a pain in the end.
That one can always do. The FFI will always provide some sort of interface to malloc and/or valloc (even if it is as primitive as actually calling those functions). But the question for Spur is what facilities it should provide and I want to provide something much easier to use (i.e. pinned ByteArrays and the like).
On Wed, Sep 25, 2013 at 2:56 PM, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
Alignment apart, one thing that could be useful is the ability to pass the address of a byte array, offset at will. For example, I have an array of unboxed double complex (real0, imaginary0, real1, imaginary1, ...) and I want to copy the imaginary part only to another location. BLAS provides enough methods for that; I just need to pass:
- the ByteArray address + 8 bytes (+1 sizeof double)
- half the array length
- and a stride of 16 bytes (2 sizeof double).
In most Smalltalk dialects this is currently only possible with external heap memory, not with Smalltalk-managed memory... OK, some guru will tell me to do it with BitBlt, but there must be something easier...
I feel like this would open many possibilities for interpreting low-level languages (like invoking some primitives on an emulated memory blob).
Hi Nicolas,
On Wed, Sep 25, 2013 at 2:56 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
Yes, good idea. David Leibs proposed this for DLLCC a while back and I implemented it, even though IIRC there's no image-level support for it. It's easy to add to FFI argument parsing. Consider it on the list, but remind me and/or Igor when we revisit the ThreadedFFI and NativeBoost once Spur is functional.
Hi Henry,
On Wed, Sep 25, 2013 at 1:14 AM, Henrik Johansen < henrik.s.johansen@veloxit.no> wrote:
Not to be a drag, but SSE operations have a 16-byte alignment requirement. With 8-byte alignment of the first indexable slot, you'd still need to use unaligned loads/stores.
Talking out-of-line with someone who has lots of experience with Smalltalk memory management, it came to me that the pinning primitive could (and should) take an alignment argument. So one says e.g.
buffer := (ByteArray new: 1024) pinWithByteAlignment: 16
On Wed, 25 Sep 2013 12:36:53 -0700, Eliot Miranda eliot.miranda@gmail.com wrote:
Talking out-of-line with someone who has lots of experience with Smalltalk memory management, it came to me that the pinning primitive could (and should) take an alignment argument. So one says e.g.
buffer := (ByteArray new: 1024) pinWithByteAlignment: 16
In the spirit of throwing out ideas/questions to see what sticks/helps/what-ever: would it be possible/better to do this:
buffer := ByteArray new: 1024 pinWithByteAlignment: 16.
instead of:
buffer := (ByteArray new: 1024) pinWithByteAlignment: 16.
I know it moves the pinning/alignment to the class side and that may be a problem, but it seems to me it would be better to pin/align when the object is created than afterwards, when it may have to be moved. Or am I way off base here?
Lou ----------------------------------------------------------- Louis LaBrunda Keystone Software Corp. SkypeMe callto://PhotonDemon mailto:Lou@Keystone-Software.com http://www.Keystone-Software.com
On Sep 25, 2013, at 9:36 , Eliot Miranda eliot.miranda@gmail.com wrote:
Talking out-of-line with someone who has lots of experience with Smalltalk memory management, it came to me that the pinning primitive could (and should) take an alignment argument. So one says e.g.
buffer := (ByteArray new: 1024) pinWithByteAlignment: 16
I feel this, plus an FFI that lets you specify parameter alignment tags* and pin the arguments accordingly, would be an excellent solution. For example, AVX ops already require 32-byte alignment. Covering all the cases where alignment might be an issue seems like a hefty task for the allocator to care about by default.
It might be a good idea to make sure they're padded to a corresponding multiple of bytes as well, just to be safe :) (Say, some library using movapd to store the results back as well, without regard for odd numbers of elements, thus potentially overwriting the next object's header.)
Cheers, Henry
P.S. For the record, alignment isn't ever an issue when writing your own SSE functions using NB. You'd simply truncate the provided pointer down to a 16/32-byte boundary, do aligned loads, and mask out the leading/trailing bytes when storing using maskmovdqu/vmaskmov** ;)
*A custom FFI interrupt handler for #GP returning an error instead of crashing when an alignment tag is missing would be out of the question though, right? ;) ** Well, if someone gets around to implementing the VEX-encoded instructions in NB, that is
On 24 September 2013 21:58, Eliot Miranda eliot.miranda@gmail.com wrote:
Yes, but for the moment there is a hack one can use, a neat hack invented by Andreas Raab. The Squeak GC is a two-space GC: old space (collected by fullGC) and new space (collected by incrementalGC). An incrementalGC will move objects in new space but leave objects in old space alone. A tenuringIncrementalGC will compact new space and then make new space part of old space. Therefore one way of nearly pinning objects is to tenure them into old space via tenuringIncrementalGC and then lock fullGC, preventing fullGC from running until the external call is finished. All arguments to the call become old, and they won't be moved until the fullGCLock is released. This doesn't help with passing a buffer that will be used after the call returns, but it does help with a buffer being passed to code that might call back.
This recipe feels like taking roots from a voodoo cult :) and is quite unreliable and limiting:
- to ensure more or less 'stable' system behavior, you need to do a fullGC before every such call
- where is the guarantee that user code which implements a callback won't produce enough garbage to force a full GC?
I'd prefer having a strong public contract with the VM that allows me to pin object(s) in memory, instead of relying on private details of the VM's internals or, what is worse, on the user's sanity :)
Besides, one thing I found lately: it's a pity that our object format doesn't use a butterfly layout. I realized that looking at some code which tries to pass an array of pointers and does a lot of manual conversion for it, while if we had butterfly headers we could just pass an object pointer (like an Array instance) as a C array of pointers.
Since we're throwing out ideas... what about good old indirection? :)
we could use an extra object format with indirection to its data, with the following layout: <header bitOr: PINNED><data pointer>
so the data (only for variable bytes, not var-references, of course) associated with such an object can be located in a separate non-movable region. You need to do extra work when scavenging objects with this special format, but for the rest it's almost the same: the object (oop) itself can still move in memory; no big deal, since its data pointer remains the same.
extending the idea, we could even have objects which point inside another object, like with Nicolas's unboxed complex numbers: I could say I want an object which points to the imaginary field of an unboxed complex value. So the complex value can be represented as a pinned object big enough to hold 2 floats:
<header bitOr: PINNED><data pointer to 2 64-bit floats>
and the object representing an imaginary float part can be following:
<header bitOr: PINNED-SLAVE><complex oop><data pointer + 8>
the main difference between PINNED and PINNED-SLAVE is that during scavenging the GC will ignore, and won't deallocate, the buffer pointed to by the slave; that's why the slave holds a reference to its master (<complex oop> in the example), to prevent deallocation/garbage collection of the master and its allocated buffer.
in fact, the PINNED-SLAVE format can be used for pointing to anything in memory (even outside the memory managed by the GC, because the GC never (de)allocates that memory). Like that, I can easily represent some data blob/buffer provided by an external library as a ByteArray object in the image:
<header bitOr: PINNED-SLAVE><nil><buffer pointer>
vm-dev@lists.squeakfoundation.org