Primitive 40 (asFloat) fails for me

List overview All Threads
Download

newer

older

An actual paper about something I...

Pharo7 stock image shows a blank...

Christian Kellermann

19 Feb 2019 19 Feb '19

12:25 p.m.

Hi,

I am currently using the latest VM with commit f9ae4a1479122b448fcfdc80bb2caa09b53aa474

On a Squeak image:

----- /home/ckeen/Sync/src/Muffin/Squeak5.2-18225-64bit.dependents.image Squeak5.2 latest update: #18231 Current Change Set: Unnamed1 Image format 68021 (64 bit)

Virtual Machine --------------- /home/ckeen/src/playground/opensmalltalk-vm/products/sqcogspur64linuxht/lib/squeak/5.0-201902181742/squeak Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2509] Unix built on Feb 19 2019 08:58:47 Compiler: 8.2.1 20181127 platform sources revision VM: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm Date: Mon Feb 18 09:42:23 2019 CommitHash: f9ae4a147 Plugins: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019 StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019

-100 asFloat returns 100 for me. Can anyone confirm this?

The Float tests from the Kernel-Number category also fail. Is my image out of sync with the VM?

Kind regards,

Christian

-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.

Attachments:

signature.asc (application/pgp-signature — 801 bytes)

Show replies by date

Tobias Pape

19 Feb 19 Feb

5:02 p.m.

Hi,

...

On 19.02.2019, at 12:25, Christian Kellermann ckeen@pestilenz.org wrote:

Hi,

I am currently using the latest VM with commit f9ae4a1479122b448fcfdc80bb2caa09b53aa474

On a Squeak image:

/home/ckeen/Sync/src/Muffin/Squeak5.2-18225-64bit.dependents.image Squeak5.2 latest update: #18231 Current Change Set: Unnamed1 Image format 68021 (64 bit)

Virtual Machine

/home/ckeen/src/playground/opensmalltalk-vm/products/sqcogspur64linuxht/lib/squeak/5.0-201902181742/squeak Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2509] Unix built on Feb 19 2019 08:58:47 Compiler: 8.2.1 20181127 platform sources revision VM: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm Date: Mon Feb 18 09:42:23 2019 CommitHash: f9ae4a147 Plugins: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019 StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019

What Platform [os/cpu] is this on?

Best regards -Tobias

...

-100 asFloat returns 100 for me. Can anyone confirm this?

The Float tests from the Kernel-Number category also fail. Is my image out of sync with the VM?

Kind regards,

Christian

-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.

Christian Kellermann

5:05 p.m.

* Tobias Pape Das.Linux@gmx.de [190219 17:02]:

...

Hi,

...
On 19.02.2019, at 12:25, Christian Kellermann ckeen@pestilenz.org wrote:

Hi,

I am currently using the latest VM with commit f9ae4a1479122b448fcfdc80bb2caa09b53aa474

On a Squeak image:

/home/ckeen/Sync/src/Muffin/Squeak5.2-18225-64bit.dependents.image Squeak5.2 latest update: #18231 Current Change Set: Unnamed1 Image format 68021 (64 bit)

Virtual Machine

/home/ckeen/src/playground/opensmalltalk-vm/products/sqcogspur64linuxht/lib/squeak/5.0-201902181742/squeak Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.2509] Unix built on Feb 19 2019 08:58:47 Compiler: 8.2.1 20181127 platform sources revision VM: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm Date: Mon Feb 18 09:42:23 2019 CommitHash: f9ae4a147 Plugins: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm CoInterpreter VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019 StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid: 91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019

What Platform [os/cpu] is this on?

Linux/amd64 sorry for omitting this...

-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.

Eliot Miranda

6:17 p.m.

Hi Christian,

On Tue, Feb 19, 2019 at 8:05 AM Christian Kellermann ckeen@pestilenz.org wrote:

...

Tobias Pape Das.Linux@gmx.de [190219 17:02]:

...
...
Hi,

...
On 19.02.2019, at 12:25, Christian Kellermann ckeen@pestilenz.org

wrote:

...
...
Hi,

I am currently using the latest VM with commit f9ae4a1479122b448fcfdc80bb2caa09b53aa474

On a Squeak image:

/home/ckeen/Sync/src/Muffin/Squeak5.2-18225-64bit.dependents.image Squeak5.2 latest update: #18231 Current Change Set: Unnamed1 Image format 68021 (64 bit)

Virtual Machine

/home/ckeen/src/playground/opensmalltalk-vm/products/sqcogspur64linuxht/lib/squeak/5.0-201902181742/squeak

...
...
Croquet Closure Cog[Spur] VM [CoInterpreterPrimitives

VMMaker.oscog-eem.2509]

...
...
Unix built on Feb 19 2019 08:58:47 Compiler: 8.2.1 20181127 platform sources revision VM: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm

Date: Mon Feb 18 09:42:23 2019 CommitHash: f9ae4a147 Plugins: 201902181742 ckeen@blackbeard:src/playground/opensmalltalk-vm

...
...
CoInterpreter VMMaker.oscog-eem.2509 uuid:

91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019

...
...
StackToRegisterMappingCogit VMMaker.oscog-eem.2509 uuid:

91e81f64-95de-4914-a960-8f842be3a194 Feb 19 2019

...
What Platform [os/cpu] is this on?

Linux/amd64 sorry for omitting this...

and you report that -100 asFloat answers 100.0.

Can you please try the following experiments?

First what is the result of

| floats | floats := Array new: 30. 1 to: floats size do: [:i| floats at: i put: -100 asFloat]. floats

If the result has the first several floats as 100.0 and the last few as -100.0 then...

Edit spur64src/vm/gcc3x-cointerp.c so that

static void primitiveAsFloat(void) { DECL_MAYBE_SQ_GLOBAL_STRUCT sqInt rcvr; char *sp;

rcvr = longAt(GIV(stackPointer)); assert((((rcvr) & 7) == 1)); /* begin pop:thenPushFloat: */ longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) ))); GIV(stackPointer) = sp; }

reads

static void primitiveAsFloat(void) { DECL_MAYBE_SQ_GLOBAL_STRUCT sqInt rcvr; char *sp;

and test -100 asFloat. Does this fix it?

And does this rewrite fix it?

static void primitiveAsFloat(void) { DECL_MAYBE_SQ_GLOBAL_STRUCT sqInt rcvr; char *sp;

(I would expect not)

_,,,^..^,,,_ best, Eliot

Christian Kellermann

9:38 p.m.

Hi Eliot!

* Eliot Miranda eliot.miranda@gmail.com [190219 18:18]:

...

and you report that -100 asFloat answers 100.0. Can you please try the following experiments?

I will do this on the work machine tomorrow (in 12 hours roughly). I have just tested this on OpenBSD/amd64 with its old gcc 4.2.1 and here it works as expected. I think on the other machine I am using arch linux' gcc7, so there might be a too aggressive optimizer at work. I will report back.

Thanks!

Christian

-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.

Christian Kellermann

20 Feb 20 Feb

8:30 a.m.

* Eliot Miranda eliot.miranda@gmail.com [190219 18:18]:

...

First what is the result of

| floats |
floats := Array new: 30.
1 to: floats size do: [:i| floats at: i put: -100 asFloat].
floats

#(100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0)

All 100's...

...

If the result has the first several floats as 100.0 and the last few as -100.0 then...

omitted.

...

static void primitiveAsFloat(void) { DECL_MAYBE_SQ_GLOBAL_STRUCT sqInt rcvr; char *sp;
rcvr = longAt(GIV(stackPointer));
assert((((rcvr) & 7) == 1));
/* begin pop:thenPushFloat: */
longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)),
floatObjectOf(((double) (((sqInt)rcvr >> 3)) ))); GIV(stackPointer) = sp; }

(I would expect not)

You are right:

diff --git a/spur64src/vm/gcc3x-cointerp.c b/spur64src/vm/gcc3x-cointerp.c index c80be332c..0256a6585 100644 --- a/spur64src/vm/gcc3x-cointerp.c +++ b/spur64src/vm/gcc3x-cointerp.c @@ -25996,7 +25996,7 @@ primitiveAsFloat(void) rcvr = longAt(GIV(stackPointer)); assert((((rcvr) & 7) == 1)); /* begin pop:thenPushFloat: */ - longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) ))); + longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) (((sqInt)rcvr >> 3)) ))); GIV(stackPointer) = sp; }

-100 asFloat → 100

This has been built with $ gcc --version gcc (GCC) 8.2.1 20181127

When I disable optimisations with -O0, I get the correct result. So I guess, let's see why the compiler does what it does...

Kind regards,

Christian

-- May you be peaceful, may you live in safety, may you be free from suffering, and may you live with ease.

Levente Uzonyi

22 Feb 22 Feb

12:43 a.m.

I had a look at the generated code and I think the generated code for primitiveAsFloat has no issues. The machine code is simple and looks correct with gcc 8, thought the pxor operation is unnecessary (and not generated by gcc 4) as cvtsi2sd will zero fill the higher 64 bits of xmm0.

GCC 4.8:

0000000000432ff0 <primitiveAsFloat>: { DECL_MAYBE_SQ_GLOBAL_STRUCT 432ff0: 53 push %rbx rcvr = longAt(GIV(stackPointer)); 432ff1: 48 8b 1d 20 24 36 00 mov 0x362420(%rip),%rbx # 795418 <stackPointer> longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) ))); 432ff8: 48 8b 03 mov (%rbx),%rax 432ffb: 48 c1 f8 03 sar $0x3,%rax 432fff: f2 48 0f 2a c0 cvtsi2sd %rax,%xmm0 433004: e8 37 ee ff ff callq 431e40 <floatObjectOf> 433009: 48 89 03 mov %rax,(%rbx) GIV(stackPointer) = sp; 43300c: 48 89 1d 05 24 36 00 mov %rbx,0x362405(%rip) # 795418 <stackPointer> } 433013: 5b pop %rbx 433014: c3 retq 433015: 90 nop 433016: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 43301d: 00 00 00

GCC 8.2:

000000000004a1f0 <primitiveAsFloat>: { DECL_MAYBE_SQ_GLOBAL_STRUCT 4a1f0: 53 push %rbx rcvr = longAt(GIV(stackPointer)); 4a1f1: 48 8b 1d 80 32 36 00 mov 0x363280(%rip),%rbx # 3ad478 <stackPointer> longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) ))); 4a1f8: 66 0f ef c0 pxor %xmm0,%xmm0 4a1fc: 48 8b 03 mov (%rbx),%rax 4a1ff: 48 c1 f8 03 sar $0x3,%rax 4a203: f2 48 0f 2a c0 cvtsi2sd %rax,%xmm0 4a208: e8 b3 eb ff ff callq 48dc0 <floatObjectOf> 4a20d: 48 89 03 mov %rax,(%rbx) GIV(stackPointer) = sp; 4a210: 48 89 1d 61 32 36 00 mov %rbx,0x363261(%rip) # 3ad478 <stackPointer> } 4a217: 5b pop %rbx 4a218: c3 retq 4a219: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)

Actually the returned float object is -100.0 (-100 asFloat < 0.0 returns true), but the code used for printing doesn't work properly, because #basicAt: returns 0 (when the argument is 1 or 2), and the method used to print the number uses #signBit to print the negative sign, and #signBit relies on #basicAt: returning a non-zero value when the number is negative.

So, we're back to the original issue I found: #basicAt: (primitive 38) doesn't work with newer gcc versions.

Levente

Tobias Pape

8:35 a.m.

Hi Levente.

...

On 22.02.2019, at 00:43, Levente Uzonyi leves@caesar.elte.hu wrote:

I had a look at the generated code and I think the generated code for primitiveAsFloat has no issues. The machine code is simple and looks correct with gcc 8, thought the pxor operation is unnecessary (and not generated by gcc 4) as cvtsi2sd will zero fill the higher 64 bits of xmm0.

GCC 4.8:

0000000000432ff0 <primitiveAsFloat>: { DECL_MAYBE_SQ_GLOBAL_STRUCT 432ff0: 53 push %rbx rcvr = longAt(GIV(stackPointer)); 432ff1: 48 8b 1d 20 24 36 00 mov 0x362420(%rip),%rbx # 795418 <stackPointer> longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) ))); 432ff8: 48 8b 03 mov (%rbx),%rax 432ffb: 48 c1 f8 03 sar $0x3,%rax 432fff: f2 48 0f 2a c0 cvtsi2sd %rax,%xmm0 433004: e8 37 ee ff ff callq 431e40 <floatObjectOf> 433009: 48 89 03 mov %rax,(%rbx) GIV(stackPointer) = sp; 43300c: 48 89 1d 05 24 36 00 mov %rbx,0x362405(%rip) # 795418 <stackPointer> } 433013: 5b pop %rbx 433014: c3 retq 433015: 90 nop 433016: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 43301d: 00 00 00

GCC 8.2:

000000000004a1f0 <primitiveAsFloat>: { DECL_MAYBE_SQ_GLOBAL_STRUCT 4a1f0: 53 push %rbx rcvr = longAt(GIV(stackPointer)); 4a1f1: 48 8b 1d 80 32 36 00 mov 0x363280(%rip),%rbx # 3ad478 <stackPointer> longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) ))); 4a1f8: 66 0f ef c0 pxor %xmm0,%xmm0 4a1fc: 48 8b 03 mov (%rbx),%rax 4a1ff: 48 c1 f8 03 sar $0x3,%rax 4a203: f2 48 0f 2a c0 cvtsi2sd %rax,%xmm0 4a208: e8 b3 eb ff ff callq 48dc0 <floatObjectOf> 4a20d: 48 89 03 mov %rax,(%rbx) GIV(stackPointer) = sp; 4a210: 48 89 1d 61 32 36 00 mov %rbx,0x363261(%rip) # 3ad478 <stackPointer> } 4a217: 5b pop %rbx 4a218: c3 retq 4a219: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)

Actually the returned float object is -100.0 (-100 asFloat < 0.0 returns true), but the code used for printing doesn't work properly, because #basicAt: returns 0 (when the argument is 1 or 2), and the method used to print the number uses #signBit to print the negative sign, and #signBit relies on #basicAt: returning a non-zero value when the number is negative.

So, we're back to the original issue I found: #basicAt: (primitive 38) doesn't work with newer gcc versions.

Wow. Great analysis! Thank you :)

Best regards -Tobias

Levente Uzonyi

10:58 a.m.

Thanks. I spent some time in gdb land and there seems to be a miscompiled conditional jump. The question is: is it a compiler bug or a bug in the Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is generated.

Here is the disassembled code with my comments following the path when -100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt: 0x0000555555589ec0 <+0>: mov 0x3775b1(%rip),%rdx # 0x555555901478 <stackPointer> %rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively). 0x0000555555589ec7 <+7>: mov (%rdx),%rax %rcx is the receiver object's address 0x0000555555589eca <+10>: mov 0x8(%rdx),%rcx Is the index SmallInteger 1? 0x0000555555589ece <+14>: cmp $0x9,%rax If yes, jump to 0x555555589ef8 0x0000555555589ed2 <+18>: je 0x555555589ef8 <primitiveFloatAt+56> 0x0000555555589ed4 <+20>: cmp $0x11,%rax 0x0000555555589ed8 <+24>: je 0x555555589f30 <primitiveFloatAt+112> 0x0000555555589eda <+26>: and $0x7,%eax 0x0000555555589edd <+29>: cmp $0x1,%rax 0x0000555555589ee1 <+33>: sete %al 0x0000555555589ee4 <+36>: movzbl %al,%eax 0x0000555555589ee7 <+39>: add $0x3,%rax 0x0000555555589eeb <+43>: mov %rax,0x37757e(%rip) # 0x555555901470 <primFailCode> 0x0000555555589ef2 <+50>: retq 0x0000555555589ef3 <+51>: nopl 0x0(%rax,%rax,1) Set the value of %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important. 0x0000555555589ef8 <+56>: xor %eax,%eax Check if the receiver has the SmallFloat tag set (0x4) 0x0000555555589efa <+58>: test $0x4,%cl If yes, jump to 0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0. => 0x0000555555589efd <+61>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 2nd 32-bit field of the receiver into %rax with sign extension? 0x0000555555589eff <+63>: movslq 0xc(%rcx),%rax Shift by 3 bits to make room for the tag. 0x0000555555589f03 <+67>: shl $0x3,%rax Adjust the stack pointer. 0x0000555555589f07 <+71>: add $0x8,%rdx Prepare a 32-bit mask shifted left by 3 bits for the tag. 0x0000555555589f0b <+75>: movabs $0x7fffffff8,%rcx Mask the result. 0x0000555555589f15 <+85>: and %rcx,%rax Add the SmallInteger tag. 0x0000555555589f18 <+88>: or $0x1,%rax Store the result on the stack. 0x0000555555589f1c <+92>: mov %rax,(%rdx) 0x0000555555589f1f <+95>: mov %rdx,0x377552(%rip) # 0x555555901478 <stackPointer> 0x0000555555589f26 <+102>: retq 0x0000555555589f27 <+103>: nopw 0x0(%rax,%rax,1) 0x0000555555589f30 <+112>: xor %eax,%eax 0x0000555555589f32 <+114>: test $0x4,%cl 0x0000555555589f35 <+117>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 1st 32-bit field of the receiver into %rax with sign extension? 0x0000555555589f37 <+119>: movslq 0x8(%rcx),%rax 0x0000555555589f3b <+123>: jmp 0x555555589f03 <primitiveFloatAt+67> End of assembler dump.

Levente

On Fri, 22 Feb 2019, Tobias Pape wrote:

...

Hi Levente.

...
On 22.02.2019, at 00:43, Levente Uzonyi leves@caesar.elte.hu wrote:

I had a look at the generated code and I think the generated code for primitiveAsFloat has no issues. The machine code is simple and looks correct with gcc 8, thought the pxor operation is unnecessary (and not generated by gcc 4) as cvtsi2sd will zero fill the higher 64 bits of xmm0.

GCC 4.8:

0000000000432ff0 <primitiveAsFloat>: { DECL_MAYBE_SQ_GLOBAL_STRUCT 432ff0: 53 push %rbx rcvr = longAt(GIV(stackPointer)); 432ff1: 48 8b 1d 20 24 36 00 mov 0x362420(%rip),%rbx # 795418 <stackPointer> longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) ))); 432ff8: 48 8b 03 mov (%rbx),%rax 432ffb: 48 c1 f8 03 sar $0x3,%rax 432fff: f2 48 0f 2a c0 cvtsi2sd %rax,%xmm0 433004: e8 37 ee ff ff callq 431e40 <floatObjectOf> 433009: 48 89 03 mov %rax,(%rbx) GIV(stackPointer) = sp; 43300c: 48 89 1d 05 24 36 00 mov %rbx,0x362405(%rip) # 795418 <stackPointer> } 433013: 5b pop %rbx 433014: c3 retq 433015: 90 nop 433016: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 43301d: 00 00 00

GCC 8.2:

000000000004a1f0 <primitiveAsFloat>: { DECL_MAYBE_SQ_GLOBAL_STRUCT 4a1f0: 53 push %rbx rcvr = longAt(GIV(stackPointer)); 4a1f1: 48 8b 1d 80 32 36 00 mov 0x363280(%rip),%rbx # 3ad478 <stackPointer> longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)), floatObjectOf(((double) ((rcvr >> 3)) ))); 4a1f8: 66 0f ef c0 pxor %xmm0,%xmm0 4a1fc: 48 8b 03 mov (%rbx),%rax 4a1ff: 48 c1 f8 03 sar $0x3,%rax 4a203: f2 48 0f 2a c0 cvtsi2sd %rax,%xmm0 4a208: e8 b3 eb ff ff callq 48dc0 <floatObjectOf> 4a20d: 48 89 03 mov %rax,(%rbx) GIV(stackPointer) = sp; 4a210: 48 89 1d 61 32 36 00 mov %rbx,0x363261(%rip) # 3ad478 <stackPointer> } 4a217: 5b pop %rbx 4a218: c3 retq 4a219: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)

Actually the returned float object is -100.0 (-100 asFloat < 0.0 returns true), but the code used for printing doesn't work properly, because #basicAt: returns 0 (when the argument is 1 or 2), and the method used to print the number uses #signBit to print the negative sign, and #signBit relies on #basicAt: returning a non-zero value when the number is negative.

So, we're back to the original issue I found: #basicAt: (primitive 38) doesn't work with newer gcc versions.

Wow. Great analysis! Thank you :)

Best regards -Tobias

Eliot Miranda

9:02 p.m.

Hi Levente,

On Fri, Feb 22, 2019 at 1:58 AM Levente Uzonyi leves@caesar.elte.hu wrote:

...

Thanks. I spent some time in gdb land and there seems to be a miscompiled conditional jump. The question is: is it a compiler bug or a bug in the Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is generated.

Here is the disassembled code with my comments following the path when -100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt: 0x0000555555589ec0 <+0>: mov 0x3775b1(%rip),%rdx # 0x555555901478 <stackPointer> %rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively). 0x0000555555589ec7 <+7>: mov (%rdx),%rax %rcx is the receiver object's address 0x0000555555589eca <+10>: mov 0x8(%rdx),%rcx Is the index SmallInteger 1? 0x0000555555589ece <+14>: cmp $0x9,%rax If yes, jump to 0x555555589ef8 0x0000555555589ed2 <+18>: je 0x555555589ef8 <primitiveFloatAt+56> 0x0000555555589ed4 <+20>: cmp $0x11,%rax 0x0000555555589ed8 <+24>: je 0x555555589f30 <primitiveFloatAt+112> 0x0000555555589eda <+26>: and $0x7,%eax 0x0000555555589edd <+29>: cmp $0x1,%rax 0x0000555555589ee1 <+33>: sete %al 0x0000555555589ee4 <+36>: movzbl %al,%eax 0x0000555555589ee7 <+39>: add $0x3,%rax 0x0000555555589eeb <+43>: mov %rax,0x37757e(%rip) # 0x555555901470 <primFailCode> 0x0000555555589ef2 <+50>: retq 0x0000555555589ef3 <+51>: nopl 0x0(%rax,%rax,1) Set the value of %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important. 0x0000555555589ef8 <+56>: xor %eax,%eax Check if the receiver has the SmallFloat tag set (0x4) 0x0000555555589efa <+58>: test $0x4,%cl If yes, jump to 0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0. => 0x0000555555589efd <+61>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 2nd 32-bit field of the receiver into %rax with sign extension? 0x0000555555589eff <+63>: movslq 0xc(%rcx),%rax Shift by 3 bits to make room for the tag. 0x0000555555589f03 <+67>: shl $0x3,%rax Adjust the stack pointer. 0x0000555555589f07 <+71>: add $0x8,%rdx Prepare a 32-bit mask shifted left by 3 bits for the tag. 0x0000555555589f0b <+75>: movabs $0x7fffffff8,%rcx Mask the result. 0x0000555555589f15 <+85>: and %rcx,%rax Add the SmallInteger tag. 0x0000555555589f18 <+88>: or $0x1,%rax Store the result on the stack. 0x0000555555589f1c <+92>: mov %rax,(%rdx) 0x0000555555589f1f <+95>: mov %rdx,0x377552(%rip) # 0x555555901478 <stackPointer> 0x0000555555589f26 <+102>: retq 0x0000555555589f27 <+103>: nopw 0x0(%rax,%rax,1) 0x0000555555589f30 <+112>: xor %eax,%eax 0x0000555555589f32 <+114>: test $0x4,%cl 0x0000555555589f35 <+117>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 1st 32-bit field of the receiver into %rax with sign extension? 0x0000555555589f37 <+119>: movslq 0x8(%rcx),%rax 0x0000555555589f3b <+123>: jmp 0x555555589f03 <primitiveFloatAt+67> End of assembler dump.

I'm pretty sure the C code is correct, but Nicolas is our C lawyer. However we need to look at both primitiveFloatAt and fetchLong32:ofFloatObject:. Here's the source:

/* Spur64BitMemoryManager>>#fetchLong32:ofFloatObject: */ static sqInt NoDbgRegParms fetchLong32ofFloatObject(sqInt fieldIndex, sqInt oop) { usqLong bits; usqLong rot;

if (!(oop & (smallFloatTag()))) { return long32At((oop + BaseHeaderSize) + (((sqInt)((usqInt)(fieldIndex) << 2)))); } /* begin smallFloatBitsOf: */ assert(isImmediateFloat(oop)); rot = ((usqInt) (((usqInt)oop))) >> (numTagBits()); if (rot > 1) {

/* a.k.a. ~= +/-0.0 */ rot += ((sqInt)((usqInt)((smallFloatExponentOffset())) << ((smallFloatMantissaBits()) + 1))); } /* begin rotateRight: */ rot = (rot << 0x3F) + (((usqInt) (((usqInt)rot))) >> 1); bits = rot; return (((int *) ((&bits))))[fieldIndex]; }

Note that [+/-]100.0 is an immediate float. Nicolas, is this kind of pointer punning still OK? Notice that at no time do we pun the bits to a double. They are always integral bits. What we're doing is manipulating the bit pattern for -100.0, an immediate SmallFloat64. (Spur64BitMemoryManager new smallFloatObjectOf: -100.0) hex '16r859000000000000C' where 16r4 is the tag pattern for an immediate float and 16r8 is the sign bit in the least significant position.

...

Levente

On Fri, 22 Feb 2019, Tobias Pape wrote:

...
Hi Levente.

...
On 22.02.2019, at 00:43, Levente Uzonyi leves@caesar.elte.hu wrote:

I had a look at the generated code and I think the generated code for

primitiveAsFloat has no issues. The machine code is simple and looks correct with gcc 8, thought the pxor operation is unnecessary (and not generated by gcc 4) as cvtsi2sd will zero fill the higher 64 bits of xmm0.

...
...
GCC 4.8:

0000000000432ff0 <primitiveAsFloat>: { DECL_MAYBE_SQ_GLOBAL_STRUCT 432ff0: 53 push %rbx rcvr = longAt(GIV(stackPointer)); 432ff1: 48 8b 1d 20 24 36 00 mov 0x362420(%rip),%rbx

# 795418 <stackPointer>

...
...
   longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)),
floatObjectOf(((double) ((rcvr >> 3)) )));

...
...
432ff8: 48 8b 03 mov (%rbx),%rax 432ffb: 48 c1 f8 03 sar $0x3,%rax 432fff: f2 48 0f 2a c0 cvtsi2sd %rax,%xmm0 433004: e8 37 ee ff ff callq 431e40 <floatObjectOf> 433009: 48 89 03 mov %rax,(%rbx) GIV(stackPointer) = sp; 43300c: 48 89 1d 05 24 36 00 mov %rbx,0x362405(%rip)

# 795418 <stackPointer>

...
...
} 433013: 5b pop %rbx 433014: c3 retq 433015: 90 nop 433016: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 43301d: 00 00 00

GCC 8.2:

000000000004a1f0 <primitiveAsFloat>: { DECL_MAYBE_SQ_GLOBAL_STRUCT 4a1f0: 53 push %rbx rcvr = longAt(GIV(stackPointer)); 4a1f1: 48 8b 1d 80 32 36 00 mov 0x363280(%rip),%rbx

# 3ad478 <stackPointer>

...
...
   longAtput((sp = GIV(stackPointer) + ((1 - 1) * BytesPerWord)),
floatObjectOf(((double) ((rcvr >> 3)) )));

...
...
4a1f8: 66 0f ef c0 pxor %xmm0,%xmm0 4a1fc: 48 8b 03 mov (%rbx),%rax 4a1ff: 48 c1 f8 03 sar $0x3,%rax 4a203: f2 48 0f 2a c0 cvtsi2sd %rax,%xmm0 4a208: e8 b3 eb ff ff callq 48dc0 <floatObjectOf> 4a20d: 48 89 03 mov %rax,(%rbx) GIV(stackPointer) = sp; 4a210: 48 89 1d 61 32 36 00 mov %rbx,0x363261(%rip)

# 3ad478 <stackPointer>

...
...
} 4a217: 5b pop %rbx 4a218: c3 retq 4a219: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)

Actually the returned float object is -100.0 (-100 asFloat < 0.0

returns true), but the code used for printing doesn't work properly, because #basicAt: returns 0 (when the argument is 1 or 2), and the method used to print the number uses #signBit to print the negative sign, and #signBit relies on #basicAt: returning a non-zero value when the number is negative.

...
...
So, we're back to the original issue I found: #basicAt: (primitive 38)

doesn't work with newer gcc versions.

...
...
Wow. Great analysis! Thank you :)

Best regards -Tobias

-- _,,,^..^,,,_ best, Eliot

Nicolas Cellier

24 Feb 24 Feb

9:15 p.m.

Hi Eliot, see below...

Le ven. 22 févr. 2019 à 21:03, Eliot Miranda eliot.miranda@gmail.com a écrit :

...

Hi Levente,

On Fri, Feb 22, 2019 at 1:58 AM Levente Uzonyi leves@caesar.elte.hu wrote:

...
Thanks. I spent some time in gdb land and there seems to be a miscompiled conditional jump. The question is: is it a compiler bug or a bug in the Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is generated.

Here is the disassembled code with my comments following the path when -100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt: 0x0000555555589ec0 <+0>: mov 0x3775b1(%rip),%rdx # 0x555555901478 <stackPointer> %rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively). 0x0000555555589ec7 <+7>: mov (%rdx),%rax %rcx is the receiver object's address 0x0000555555589eca <+10>: mov 0x8(%rdx),%rcx Is the index SmallInteger 1? 0x0000555555589ece <+14>: cmp $0x9,%rax If yes, jump to 0x555555589ef8 0x0000555555589ed2 <+18>: je 0x555555589ef8 <primitiveFloatAt+56> 0x0000555555589ed4 <+20>: cmp $0x11,%rax 0x0000555555589ed8 <+24>: je 0x555555589f30 <primitiveFloatAt+112> 0x0000555555589eda <+26>: and $0x7,%eax 0x0000555555589edd <+29>: cmp $0x1,%rax 0x0000555555589ee1 <+33>: sete %al 0x0000555555589ee4 <+36>: movzbl %al,%eax 0x0000555555589ee7 <+39>: add $0x3,%rax 0x0000555555589eeb <+43>: mov %rax,0x37757e(%rip) # 0x555555901470 <primFailCode> 0x0000555555589ef2 <+50>: retq 0x0000555555589ef3 <+51>: nopl 0x0(%rax,%rax,1) Set the value of %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important. 0x0000555555589ef8 <+56>: xor %eax,%eax Check if the receiver has the SmallFloat tag set (0x4) 0x0000555555589efa <+58>: test $0x4,%cl If yes, jump to 0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0. => 0x0000555555589efd <+61>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 2nd 32-bit field of the receiver into %rax with sign extension? 0x0000555555589eff <+63>: movslq 0xc(%rcx),%rax Shift by 3 bits to make room for the tag. 0x0000555555589f03 <+67>: shl $0x3,%rax Adjust the stack pointer. 0x0000555555589f07 <+71>: add $0x8,%rdx Prepare a 32-bit mask shifted left by 3 bits for the tag. 0x0000555555589f0b <+75>: movabs $0x7fffffff8,%rcx Mask the result. 0x0000555555589f15 <+85>: and %rcx,%rax Add the SmallInteger tag. 0x0000555555589f18 <+88>: or $0x1,%rax Store the result on the stack. 0x0000555555589f1c <+92>: mov %rax,(%rdx) 0x0000555555589f1f <+95>: mov %rdx,0x377552(%rip) # 0x555555901478 <stackPointer> 0x0000555555589f26 <+102>: retq 0x0000555555589f27 <+103>: nopw 0x0(%rax,%rax,1) 0x0000555555589f30 <+112>: xor %eax,%eax 0x0000555555589f32 <+114>: test $0x4,%cl 0x0000555555589f35 <+117>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 1st 32-bit field of the receiver into %rax with sign extension? 0x0000555555589f37 <+119>: movslq 0x8(%rcx),%rax 0x0000555555589f3b <+123>: jmp 0x555555589f03 <primitiveFloatAt+67> End of assembler dump.

I'm pretty sure the C code is correct, but Nicolas is our C lawyer. However we need to look at both primitiveFloatAt and fetchLong32:ofFloatObject:. Here's the source:
/* Spur64BitMemoryManager>>#fetchLong32:ofFloatObject: */
static sqInt NoDbgRegParms fetchLong32ofFloatObject(sqInt fieldIndex, sqInt oop) { usqLong bits; usqLong rot;
if (!(oop & (smallFloatTag()))) {
    return long32At((oop + BaseHeaderSize) +
(((sqInt)((usqInt)(fieldIndex) << 2)))); } /* begin smallFloatBitsOf: */ assert(isImmediateFloat(oop)); rot = ((usqInt) (((usqInt)oop))) >> (numTagBits()); if (rot > 1) {
    /* a.k.a. ~= +/-0.0 */
    rot += ((sqInt)((usqInt)((smallFloatExponentOffset())) <<
((smallFloatMantissaBits()) + 1))); } /* begin rotateRight: */ rot = (rot << 0x3F) + (((usqInt) (((usqInt)rot))) >> 1); bits = rot; return (((int *) ((&bits))))[fieldIndex]; }

Note that [+/-]100.0 is an immediate float. Nicolas, is this kind of pointer punning still OK? Notice that at no time do we pun the bits to a double. They are always integral bits. What we're doing is manipulating the bit pattern for -100.0, an immediate SmallFloat64. (Spur64BitMemoryManager new smallFloatObjectOf: -100.0) hex '16r859000000000000C' where 16r4 is the tag pattern for an immediate float and 16r8 is the sign bit in the least significant position.

In theory, pointer type aliasing is Undefined Behavior. Though, I don't well see what kind of optimization a compiler could perform here... Try one of the 2 solid choices for type punning: - memcpy - union (at least in C it's legal, I'm not even sure for C++ and don't have time to search now) or you may try the more fragile -fno-strict-aliasing (or something like that)

Note that a clever enough compiler should not invoke memcpy at all in simple cases like this. (If it is clever enough to misscompile this simple case of pointer aliasing, I don't see why it wouldn't for memcpy).

-- > _,,,^..^,,,_ > best, Eliot >

Eliot Miranda

9:52 p.m.

On Sun, Feb 24, 2019 at 12:16 PM Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...

Hi Eliot, see below...

Le ven. 22 févr. 2019 à 21:03, Eliot Miranda eliot.miranda@gmail.com a écrit :

...
Hi Levente,

On Fri, Feb 22, 2019 at 1:58 AM Levente Uzonyi leves@caesar.elte.hu wrote:

...
Thanks. I spent some time in gdb land and there seems to be a miscompiled conditional jump. The question is: is it a compiler bug or a bug in the Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is generated.

Here is the disassembled code with my comments following the path when -100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt: 0x0000555555589ec0 <+0>: mov 0x3775b1(%rip),%rdx # 0x555555901478 <stackPointer> %rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively). 0x0000555555589ec7 <+7>: mov (%rdx),%rax %rcx is the receiver object's address 0x0000555555589eca <+10>: mov 0x8(%rdx),%rcx Is the index SmallInteger 1? 0x0000555555589ece <+14>: cmp $0x9,%rax If yes, jump to 0x555555589ef8 0x0000555555589ed2 <+18>: je 0x555555589ef8 <primitiveFloatAt+56> 0x0000555555589ed4 <+20>: cmp $0x11,%rax 0x0000555555589ed8 <+24>: je 0x555555589f30 <primitiveFloatAt+112> 0x0000555555589eda <+26>: and $0x7,%eax 0x0000555555589edd <+29>: cmp $0x1,%rax 0x0000555555589ee1 <+33>: sete %al 0x0000555555589ee4 <+36>: movzbl %al,%eax 0x0000555555589ee7 <+39>: add $0x3,%rax 0x0000555555589eeb <+43>: mov %rax,0x37757e(%rip) # 0x555555901470 <primFailCode> 0x0000555555589ef2 <+50>: retq 0x0000555555589ef3 <+51>: nopl 0x0(%rax,%rax,1) Set the value of %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important. 0x0000555555589ef8 <+56>: xor %eax,%eax Check if the receiver has the SmallFloat tag set (0x4) 0x0000555555589efa <+58>: test $0x4,%cl If yes, jump to 0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0. => 0x0000555555589efd <+61>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 2nd 32-bit field of the receiver into %rax with sign extension? 0x0000555555589eff <+63>: movslq 0xc(%rcx),%rax Shift by 3 bits to make room for the tag. 0x0000555555589f03 <+67>: shl $0x3,%rax Adjust the stack pointer. 0x0000555555589f07 <+71>: add $0x8,%rdx Prepare a 32-bit mask shifted left by 3 bits for the tag. 0x0000555555589f0b <+75>: movabs $0x7fffffff8,%rcx Mask the result. 0x0000555555589f15 <+85>: and %rcx,%rax Add the SmallInteger tag. 0x0000555555589f18 <+88>: or $0x1,%rax Store the result on the stack. 0x0000555555589f1c <+92>: mov %rax,(%rdx) 0x0000555555589f1f <+95>: mov %rdx,0x377552(%rip) # 0x555555901478 <stackPointer> 0x0000555555589f26 <+102>: retq 0x0000555555589f27 <+103>: nopw 0x0(%rax,%rax,1) 0x0000555555589f30 <+112>: xor %eax,%eax 0x0000555555589f32 <+114>: test $0x4,%cl 0x0000555555589f35 <+117>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 1st 32-bit field of the receiver into %rax with sign extension? 0x0000555555589f37 <+119>: movslq 0x8(%rcx),%rax 0x0000555555589f3b <+123>: jmp 0x555555589f03 <primitiveFloatAt+67> End of assembler dump.

I'm pretty sure the C code is correct, but Nicolas is our C lawyer. However we need to look at both primitiveFloatAt and fetchLong32:ofFloatObject:. Here's the source:
/* Spur64BitMemoryManager>>#fetchLong32:ofFloatObject: */
static sqInt NoDbgRegParms fetchLong32ofFloatObject(sqInt fieldIndex, sqInt oop) { usqLong bits; usqLong rot;
if (!(oop & (smallFloatTag()))) {
    return long32At((oop + BaseHeaderSize) +
(((sqInt)((usqInt)(fieldIndex) << 2)))); } /* begin smallFloatBitsOf: */ assert(isImmediateFloat(oop)); rot = ((usqInt) (((usqInt)oop))) >> (numTagBits()); if (rot > 1) {
    /* a.k.a. ~= +/-0.0 */
    rot += ((sqInt)((usqInt)((smallFloatExponentOffset())) <<
((smallFloatMantissaBits()) + 1))); } /* begin rotateRight: */ rot = (rot << 0x3F) + (((usqInt) (((usqInt)rot))) >> 1); bits = rot; return (((int *) ((&bits))))[fieldIndex]; }

Note that [+/-]100.0 is an immediate float. Nicolas, is this kind of pointer punning still OK? Notice that at no time do we pun the bits to a double. They are always integral bits. What we're doing is manipulating the bit pattern for -100.0, an immediate SmallFloat64. (Spur64BitMemoryManager new smallFloatObjectOf: -100.0) hex '16r859000000000000C' where 16r4 is the tag pattern for an immediate float and 16r8 is the sign bit in the least significant position.
In theory, pointer type aliasing is Undefined Behavior. Though, I don't well see what kind of optimization a compiler could perform here... Try one of the 2 solid choices for type punning:

memcpy

union (at least in C it's legal, I'm not even sure for C++ and don't

have time to search now) or you may try the more fragile -fno-strict-aliasing (or something like that)

I'll rewrite to use a union.

...

Note that a clever enough compiler should not invoke memcpy at all in simple cases like this. (If it is clever enough to misscompile this simple case of pointer aliasing, I don't see why it wouldn't for memcpy).

--

...
_,,,^..^,,,_ best, Eliot

-- _,,,^..^,,,_ best, Eliot

Eliot Miranda

9:56 p.m.

On Sun, Feb 24, 2019 at 12:52 PM Eliot Miranda eliot.miranda@gmail.com wrote:

...

On Sun, Feb 24, 2019 at 12:16 PM Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...
Hi Eliot, see below...

Le ven. 22 févr. 2019 à 21:03, Eliot Miranda eliot.miranda@gmail.com a écrit :

...
Hi Levente,

On Fri, Feb 22, 2019 at 1:58 AM Levente Uzonyi leves@caesar.elte.hu wrote:

...
Thanks. I spent some time in gdb land and there seems to be a miscompiled conditional jump. The question is: is it a compiler bug or a bug in the Smalltalk -> C translation. With gcc 4.8, the correct jump instruction is generated.

Here is the disassembled code with my comments following the path when -100.0 basicAt: 1 is evaluated in the image:

Dump of assembler code for function primitiveFloatAt: 0x0000555555589ec0 <+0>: mov 0x3775b1(%rip),%rdx # 0x555555901478 <stackPointer> %rax will hold the index (1 or 2) as a SmallInteger (so the actual value will be 0x9 or 0x11 for 1 and 2 respectively). 0x0000555555589ec7 <+7>: mov (%rdx),%rax %rcx is the receiver object's address 0x0000555555589eca <+10>: mov 0x8(%rdx),%rcx Is the index SmallInteger 1? 0x0000555555589ece <+14>: cmp $0x9,%rax If yes, jump to 0x555555589ef8 0x0000555555589ed2 <+18>: je 0x555555589ef8 <primitiveFloatAt+56> 0x0000555555589ed4 <+20>: cmp $0x11,%rax 0x0000555555589ed8 <+24>: je 0x555555589f30 <primitiveFloatAt+112> 0x0000555555589eda <+26>: and $0x7,%eax 0x0000555555589edd <+29>: cmp $0x1,%rax 0x0000555555589ee1 <+33>: sete %al 0x0000555555589ee4 <+36>: movzbl %al,%eax 0x0000555555589ee7 <+39>: add $0x3,%rax 0x0000555555589eeb <+43>: mov %rax,0x37757e(%rip) # 0x555555901470 <primFailCode> 0x0000555555589ef2 <+50>: retq 0x0000555555589ef3 <+51>: nopl 0x0(%rax,%rax,1) Set the value of %eax to 0 for seemingly no reason yet. The returned SmallInteger will be computed in %rax, so this is important. 0x0000555555589ef8 <+56>: xor %eax,%eax Check if the receiver has the SmallFloat tag set (0x4) 0x0000555555589efa <+58>: test $0x4,%cl If yes, jump to 0x555555589f03. This is the miscompiled conditional jump. It should go to 0x0000555555589f37, but it will just skip the next instruction leaving %eax set to 0. => 0x0000555555589efd <+61>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 2nd 32-bit field of the receiver into %rax with sign extension? 0x0000555555589eff <+63>: movslq 0xc(%rcx),%rax Shift by 3 bits to make room for the tag. 0x0000555555589f03 <+67>: shl $0x3,%rax Adjust the stack pointer. 0x0000555555589f07 <+71>: add $0x8,%rdx Prepare a 32-bit mask shifted left by 3 bits for the tag. 0x0000555555589f0b <+75>: movabs $0x7fffffff8,%rcx Mask the result. 0x0000555555589f15 <+85>: and %rcx,%rax Add the SmallInteger tag. 0x0000555555589f18 <+88>: or $0x1,%rax Store the result on the stack. 0x0000555555589f1c <+92>: mov %rax,(%rdx) 0x0000555555589f1f <+95>: mov %rdx,0x377552(%rip) # 0x555555901478 <stackPointer> 0x0000555555589f26 <+102>: retq 0x0000555555589f27 <+103>: nopw 0x0(%rax,%rax,1) 0x0000555555589f30 <+112>: xor %eax,%eax 0x0000555555589f32 <+114>: test $0x4,%cl 0x0000555555589f35 <+117>: jne 0x555555589f03 <primitiveFloatAt+67> Copy the 1st 32-bit field of the receiver into %rax with sign extension? 0x0000555555589f37 <+119>: movslq 0x8(%rcx),%rax 0x0000555555589f3b <+123>: jmp 0x555555589f03 <primitiveFloatAt+67> End of assembler dump.

I'm pretty sure the C code is correct, but Nicolas is our C lawyer. However we need to look at both primitiveFloatAt and fetchLong32:ofFloatObject:. Here's the source:
/* Spur64BitMemoryManager>>#fetchLong32:ofFloatObject: */
static sqInt NoDbgRegParms fetchLong32ofFloatObject(sqInt fieldIndex, sqInt oop) { usqLong bits; usqLong rot;
if (!(oop & (smallFloatTag()))) {
    return long32At((oop + BaseHeaderSize) +
(((sqInt)((usqInt)(fieldIndex) << 2)))); } /* begin smallFloatBitsOf: */ assert(isImmediateFloat(oop)); rot = ((usqInt) (((usqInt)oop))) >> (numTagBits()); if (rot > 1) {
    /* a.k.a. ~= +/-0.0 */
    rot += ((sqInt)((usqInt)((smallFloatExponentOffset())) <<
((smallFloatMantissaBits()) + 1))); } /* begin rotateRight: */ rot = (rot << 0x3F) + (((usqInt) (((usqInt)rot))) >> 1); bits = rot; return (((int *) ((&bits))))[fieldIndex]; }

Note that [+/-]100.0 is an immediate float. Nicolas, is this kind of pointer punning still OK? Notice that at no time do we pun the bits to a double. They are always integral bits. What we're doing is manipulating the bit pattern for -100.0, an immediate SmallFloat64. (Spur64BitMemoryManager new smallFloatObjectOf: -100.0) hex '16r859000000000000C' where 16r4 is the tag pattern for an immediate float and 16r8 is the sign bit in the least significant position.
In theory, pointer type aliasing is Undefined Behavior. Though, I don't well see what kind of optimization a compiler could perform here... Try one of the 2 solid choices for type punning:

memcpy

union (at least in C it's legal, I'm not even sure for C++ and don't

have time to search now) or you may try the more fragile -fno-strict-aliasing (or something like that)
I'll rewrite to use a union.

Ugh, not yet. That would be so much uglier and m ore complicated. Let's pin down where the bug is first.

...

...
Note that a clever enough compiler should not invoke memcpy at all in simple cases like this. (If it is clever enough to misscompile this simple case of pointer aliasing, I don't see why it wouldn't for memcpy).

+1.

_,,,^..^,,,_ best, Eliot

Levente Uzonyi

19 Feb 19 Feb

5:40 p.m.

I've seen this bug before[1] when building the VM with gcc 7/8. Probably undefined behavior somewhere, but I didn't have time to find the cause.

Levente

[1] http://forum.world.st/ftlo-option-link-time-optimizer-tp5092657p5092693.html

On Tue, 19 Feb 2019, Christian Kellermann wrote:

1911

Age (days ago)

1916

Last active (days ago)

vm-dev@lists.squeakfoundation.org

13 comments

5 participants

tags (0)

participants (5)

Christian Kellermann
Eliot Miranda
Levente Uzonyi
Nicolas Cellier
Tobias Pape