Hi, before I continue, I've noticed that large integer multiplication seems broken on the v3 object memory (Cog & Stack). Note that this does not happen on Spur.
This is independent of my recent changes to the LargeIntegers plugin, as it happens BEFORE those changes and is related to primitive 29 rather than to the plugin... Here are the symptoms:
halfPower := 10000.
s := 111111111111111.
head := s quo: halfPower.
tail := s - (head * halfPower).
{ head as: ByteArray.
  (1 to: halfPower digitLength) collect: [:i | halfPower digitAt: i] as: ByteArray.
  (head * halfPower) as: ByteArray }.
The correct result is: #(#[199 25 70 150 2] #[16 39] #[112 237 78 18 14 101])
The wrong result I obtained with SVN revision 3651 compiled by myself is: #(#[199 25 70 150 2] #[16 39] #[112 237 78 18 254 61])
The most significant bits (above 32) are wrong... The pattern I obtain is (with the most significant byte put back on the left):
2r00111101 << 8 + 2r11111110 "wrong result"
2r01100101 << 8 + 2r00001110 "correct result"
I completely fail to infer what's going on from this pattern...
This is on Mac OS X:
clang --version
Apple LLVM version 7.3.0 (clang-703.0.29)
Target: x86_64-apple-darwin15.4.0
This goes through primitiveMultiplyLargeIntegers (29):

	oopResult := self magnitude64BitIntegerFor: result neg: aIsNegative ~= bIsNegative.

-> sz > 4 ifTrue: [objectMemory storeLong64: 0 ofObject: newLargeInteger withValue: magnitude]

(which I changed recently)
then:

storeLong64: longIndex ofObject: oop withValue: value
	<var: #value type: #sqLong>
	self flag: #endianness.
	self long32At: oop + self baseHeaderSize + (longIndex << 3)
			put: (self cCode: [value] inSmalltalk: [value bitAnd: 16rFFFFFFFF]);
		long32At: oop + self baseHeaderSize + (longIndex << 3) + 4
			put: (value >> 32).
	^value
I don't see anything wrong with this code... Well, using a shift on a signed value is not that good, but it works here for at least 3 reasons:
- we throw the sign-bit extension away
- Slang inlining misses the signedness difference, and the generated C code is correct
- anyway, in our case, the sign bit was 0...
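To make the first reason concrete, here is a minimal standalone C sketch (an illustration with the value from the example above, not VMMaker output): the truncation of each half to 32 bits discards whatever a signed right shift would smear into the upper bits:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int64_t value = 0x650E124EED70LL;         /* head * halfPower; sign bit is 0 */
    uint32_t low  = (uint32_t)value;           /* low 32 bits: 0x124eed70 */
    uint32_t high = (uint32_t)(value >> 32);   /* signed shift, then truncate: 0x650e */
    printf("low=%08x high=%08x\n", low, high);
    return 0;
}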
The previous implementation in magnitude64BitIntegerFor:neg: was:

	sz > 4 ifTrue:
		[objectMemory storeLong32: 1 ofObject: newLargeInteger withValue: magnitude >> 32].
	objectMemory storeLong32: 0 ofObject: newLargeInteger
		withValue: (self cCode: [magnitude] inSmalltalk: [magnitude bitAnd: 16rFFFFFFFF])
Not much different, except that the high 32 bits and low 32 bits are written in a different order...
If I had a server, I'd like to bisect:
- from which version this happens
- on which OS
- with which compiler
Without that information, I think I'll have to debug it either through the simulator or directly in gdb, but I feel like I'm really wasting my time :(
And I've got a 2nd problem like this one...
To be more specific,
The first thing to do would be to confirm that the KernelNumber tests fail with a squeak.stack.v3 VM compiled from the head revision of the Cog VM branch, on whatever OS.
Then, knowing from which SVN version of the Cog VM branch the KernelNumber tests start failing would be nice.
The bisect job is to:
- iterate on the version number (whatever strategy, bsearch or something)
- check out the VM sources
- compile build.favouriteOS/squeak.stack.v3
- run a v3 image with the generated VM and launch the KernelNumber tests
Really a job for a Jenkins bot, a Travis bot or whatever... The next good thing would be to give a little love to build.squeak.org or any other similar solution. I only see red disks on that site...
Follow up: I did bisect manually:
3648 (VMMaker.oscog-eem.nice.1729) OK
3649 (VMMaker.oscog-eem.1740) not OK
So something went wrong for v3 between these two versions. At the same time, it works for Spur.
Spur objects are 8-byte aligned; v3 objects are 4-byte aligned... So fetching 64 bits from v3 MUST be decomposed into two 32-bit fetches to avoid a segfault from a misaligned fetch.
OK, let's look at how it is performed:
fetchLong64: longIndex ofObject: oop
	<returnTypeC: #sqLong>
	^self cppIf: BytesPerWord = 8
		ifTrue: [self long64At: oop + self baseHeaderSize + (longIndex << 3)]
		ifFalse:
			[self cppIf: VMBIGENDIAN
				ifTrue: [((self long32At: oop + self baseHeaderSize + (longIndex << 3)) asUnsignedLongLong << 32)
						+ (self long32At: oop + self baseHeaderSize + (longIndex << 3 + 4))]
				ifFalse: [(self long32At: oop + self baseHeaderSize + (longIndex << 3))
						+ ((self long32At: oop + self baseHeaderSize + (longIndex << 3 + 4)) asUnsignedLongLong << 32)]]
AH AH! Operation is: low + (((unsigned) high) << 32)
With low declared as SIGNED (long32At is signed, GRRR). What if the sign bit of low is 1? Then the compiler will perform:
(unsigned long) low + ...
AND THIS WILL DO A SIGN EXTENSION...
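Here is a minimal standalone C sketch of that failure mode (an illustration with made-up values, not the generated VMMaker code): when the signed 32-bit low word has its sign bit set, the implicit promotion to 64 bits sign-extends it and corrupts the high word of the sum; masking the low word to unsigned first gives the intended result:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t  low  = (int32_t)0x80000001;  /* sign bit set */
    uint32_t high = 1;
    unsigned long long bad  = low + ((unsigned long long)high << 32);
    unsigned long long good = (uint32_t)low + ((unsigned long long)high << 32);
    printf("bad =%016llx\n", bad);   /* 0000000080000001: the high word is destroyed */
    printf("good=%016llx\n", good);  /* 0000000180000001 */
    return 0;
}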
It's not that my code was wrong... It's just that it uncovered a bug in fetchLong64:ofObject:
That cost me many hours, but it reinforces my conviction about signedness... I would much, much prefer to call unsignedLong32At: because that's what I mean 9 times out of 10: get the magnitude...
On Tue, Mar 29, 2016 at 11:36:57PM +0200, Nicolas Cellier wrote:
It's not that my code was wrong... It's just that it uncovered a bug in fetchLong64:ofObject:
Similar issues must exist in other places. The trunk VMM (aka interpreter) that I tested has no #fetchLong64:ofObject:, but it had the same failure symptoms that you saw with StackInterpreter/Cog/Spur.
That cost me many hours, but it reinforces my conviction about signedness... I would much, much prefer to call unsignedLong32At: because that's what I mean 9 times out of 10: get the magnitude...
+1
Debugging problems like this is hard. Declaring variables properly is easy.
Dave
Hi Nicolas,
On Tue, Mar 29, 2016 at 2:36 PM, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
That cost me many hours, but it reinforces my conviction about signedness... I would much, much prefer to call unsignedLong32At: because that's what I mean 9 times out of 10: get the magnitude...
Let's add them. It would be great to have them all consistent too. But at least let's add unsignedLong32At:[put:]. Wait - in the simulator, longAt:[put:] et al are unsigned. So we have three choices:
a) live with it
b) make longAt:[put:], long32At:[put:] et al unsigned in C, and add signedLongAt:[put:] et al
c) make the simulator's longAt:[put:], long32At:[put:] signed, and then add unsignedLongAt:[put:] et al
Any others?
I think a) is unwise; the simulator and C should agree. Nicolas, I share your experience of finding that debugging these issues is extremely time consuming.
My preference is for b); some uses where the type should be signed will be located by C compiler warnings, and we hope that the VM tests will catch the others.
On Wed, Mar 30, 2016 at 2:09 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
My preference is for b); some uses where the type should be signed will be located by C compiler warnings, and we hope that the VM tests will catch the others.
Wouldn't (b) risk occasional cognitive dysfunction, switching between the semantics of a C long being signed and a Slang long being unsigned? (c) seems better long term, but switching the signedness of an existing simulation method might cause short-term instability and need to be tackled in one sitting? So, a naive question... Why leave one implicit? Consider d) introducing both signedLongAt:[put:] and unsignedLongAt:[put:] and migrating gradually.
cheers -ben
2016-03-30 13:46 GMT+02:00 Ben Coman btc@openinworld.com:
Why leave one implicit? Consider d) introducing both signedLongAt:[put:] and unsignedLongAt:[put:] and migrating gradually.
My first thought was, like Eliot, to use b), because:
- the simulator is already b)
- a wild guess is that there are more unsigned than signed usages
But I like Ben's arguments better. Non-uniform naming has great costs:
- a slow learning ramp for noobs
- seeds for future bugs due to misunderstanding
For example, we already have:
* long implicitly meaning 32 bits, as in signedIntToLong and signedIntFromLong
* long unconditionally meaning 64 bits, as in sqLong
* long meaning either 32 or 64 bits depending on architecture in generated C (as in StackInterpreter>>putLong:toFile:)
* long meaning either 32 or 64 bits depending on the context (#longAt: is always 32 bits when sent to a ByteArray, Bitmap or CAccessor, but might be 64 bits when sent to the simulator, argh...)
I remember I had similar problems with word the first time I looked at VMMaker: WordArray always holds 32-bit words, but wordSize is 4 or 8 depending on the target architecture. Even now, I'm not sure which word we are talking about...
Also, b) goes with a blitz strategy:
- do a massive semantic change, long32 -> unsigned
- review the send sites and change a few variants to signed
- repair the broken C code once the image starts crashing and tests start failing
- repair the simulator, which might show different symptoms (see the implementors of #asUnsignedLong)
Either it works soon, or never...
I've noted only 47 senders of long32At:, plus the 33 senders of fetchLong32:ofObject:. And I cannot easily analyze the possible side effects through Slang type inference...
As my recent bug in LargeInteger division showed, abuse of unsigned leads to expensive bug tracking too. That somewhat restrains my enthusiasm for solution b) - chat échaudé craint l'eau froide (once bitten, twice shy) ;)
Lastly, I did not apply the b) policy in my VM branch for 32-bit large ints. Instead I introduced the unsigned variants gradually... (coward!)
Before we attempt anything, I'd like to restate the short-term goals:
- avoid uncontrolled sign extension when promoting to a larger int in C
- avoid undefined behavior, most of which is related to signed arithmetic
- make the simulator better match C (or vice versa, as the case may be)
The last goal is far away; there are subtle "differences"... For example, in the simulator the difference of two unsigned values might be < 0. In C it cannot be, unless we force it (cast, type punning, whatever...).
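A minimal C sketch of that difference (an illustration, assuming 32-bit unsigned int):

#include <stdio.h>

int main(void)
{
    unsigned int a = 1, b = 2;
    printf("%u\n", a - b);         /* 4294967295: unsigned arithmetic wraps, never negative */
    printf("%d\n", (int)(a - b));  /* -1, but only after an explicit cast back to signed */
    return 0;
}

In the simulator, the same subtraction on two non-negative Smalltalk integers simply answers -1.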
One last point: gradual introduction and non-regression testing presuppose a stable VM. I still experience random crashes that I'd like to track down first.
Changing the subject line to reflect current discussion topic.
On Wed, Mar 30, 2016 at 10:46:21PM +0200, Nicolas Cellier wrote:
I remember I had similar problems with word the first time I looked at VMMaker: WordArray always holds 32-bit words, but wordSize is 4 or 8 depending on the target architecture. Even now, I'm not sure which word we are talking about...
Me too. This is endlessly confusing no matter how long I look at it. We need to clarify the terminology and the type declarations to distinguish between references to slots in the object memory, object pointers that coincidentally happen to fit into slots in the object memory, and references to machine data types (C short/int/long/long long).
I am still undecided between b) and c). Ben's approach makes the most sense to me on the face of it, but I just did a quick test of the b) approach and the resulting VM works fine, with BitBltTest still green. I did this by modifying the MemoryAccess methods so that #long32At: generates this (which will be inlined in the actual VM code):
/* #define long32At intAt */

static unsigned int long32At(sqInt ptr)
{
	return (((unsigned int *) ((sqMemoryBase) + ptr)))[0];
}
I am probably oversimplifying, and different compilers may treat these things differently, but I see two conclusions that may be drawn from this:
1) Declaring #long32At: as either signed or unsigned is not really important. What is important is the type declaration of the thing that the result gets stored into. We are assuming two's complement representations, so a "signed" value stored into a context that treats it as unsigned becomes an unsigned value, and vice versa.
2) We currently treat the results of #byteAt:, #shortAt:, #intAt:, and #longAt: as signed values. There is no reason to change that convention. But #long32At: (or #long128At: or whatever) could easily be changed at this point to explicitly answer unsigned values. It does not look to me like this would break anything, and it makes sense intuitively, because saying "long32At" implies that I want to retrieve a binary bit field of size 32, with no presumption that it should be interpreted as two's complement signed or as unsigned.
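A tiny sketch of point 1) (long32AtExample is a hypothetical stand-in, not the real accessor): with two's complement, the returned bits are identical; only the declared type of the destination decides how they are interpreted:

#include <stdio.h>

/* illustrative stand-in for a "signed" 32-bit accessor */
static int long32AtExample(void) { return (int)0xFEDCBA98; }

int main(void)
{
    unsigned int asUnsigned = long32AtExample();  /* 4275878552 (0xFEDCBA98) */
    int          asSigned   = long32AtExample();  /* -19088744: same bits */
    printf("%u %d\n", asUnsigned, asSigned);
    return 0;
}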
Dave
2016-03-31 2:41 GMT+02:00 David T. Lewis lewis@mail.msen.com:
...saying "long32At" implies that I want to retrieve a binary bit field of size 32, with no presumption that it should be interpreted as two's complement signed or as unsigned.
It's bits as long as you don't use it in an expression. We have to decide about the sign as soon as we write: low | ((unsigned long long) high << 32). Note that I used | instead of + to better show my intention to operate on bits... But when the compiler has to extend the 32-bit low to 64 bits, it has to know how to handle the sign bit. In this example we need to explicitly declare things unsigned...
But there is even more trickiness in C: declaring all variables as unsigned still does not protect us from sign extension. Here is a small snippet to illustrate it:
$ cat >test.c <<END
#include <stdio.h>
int main() {
    unsigned short x = 0xFF00;
    unsigned long long y = x << 16;
    printf("y=%llx\n", y);
    return 0;
}
END
$ gcc test.c
$ ./a.exe
y=ffffffffff000000
Surprising? No: the unsigned short gets promoted to int (signed int) in the expression x << 16. So we shift into the sign bit (undefined behavior), and then promote a 32-bit signed value to 64 bits...
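One way out, as a sketch (test2.c is just an illustrative name): force the operand to a 64-bit unsigned type before the shift, so the implicit promotion to (signed) int never takes part in it:

$ cat >test2.c <<END
#include <stdio.h>
int main() {
    unsigned short x = 0xFF00;
    unsigned long long y = (unsigned long long)x << 16;
    printf("y=%llx\n", y);
    return 0;
}
END
$ gcc test2.c
$ ./a.exe
y=ff000000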
On Tue, Mar 29, 2016 at 11:09:49PM -0700, Eliot Miranda wrote:
a) live with it
b) make longAt:[put:], long32At:[put:] et al unsigned in C, and add signedLongAt:[put:] et al
c) make the simulator's longAt:[put:], long32At:[put:] signed, and then add unsignedLongAt:[put:] et al
Can we think about this for a couple of days? I'd like to experiment with it a bit. My first inclination was to choose b), but on re-reading the method comments that I put into the MemoryAccess methods, I am thinking that it may be wise to declare things explicitly, as in c). I don't know the right answer, but I think it would be good to play with the options a bit before proceeding.
For example, to show existing practice (not necessarily good):
MemoryAccess>>shortAt: oop
	"Answer the signed short integer value at an object memory location.
	The result is a signed sqInt value. Negative values will be sign extended,
	such that if the short integer value is binary 16rFFFF, the result will be
	16rFFFFFFFF for a 32-bit object memory, or 16rFFFFFFFFFFFFFFFF for a
	64-bit object memory."

	"sqInt shortAt(sqInt oop) { return shortAtPointer(pointerForOop(oop)); }"

	<inline: true>
	^ self shortAtPointer: (self pointerForOop: oop)
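For comparison, a minimal C sketch of what that comment promises (illustrative only: a little-endian byte array stands in for the object memory, and shortAtExample is a hypothetical name, not the generated code):

#include <stdio.h>

typedef long sqInt;  /* assume a 64-bit long, as on a 64-bit object memory */

static sqInt shortAtExample(const unsigned char *mem, sqInt oop)
{
    /* read 16 bits, then sign-extend via the cast through signed short */
    return (short)(mem[oop] | (mem[oop + 1] << 8));
}

int main(void)
{
    unsigned char mem[] = { 0xFF, 0xFF };  /* the 16-bit value 16rFFFF */
    printf("%lx\n", (unsigned long)shortAtExample(mem, 0));  /* ffffffffffffffff */
    return 0;
}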
Dave