Hi Eliot, hi all --
I think we have a sign-bit bug for character literals with code points > 16r7FFF.
Steps to reproduce:
1. Print it: "Character value: 16r8000"
2. Inspect the result by evaluating the character literal or send #asInteger to it. It will most likely not render in a standard Squeak and show up like "$? asInteger".
In a 32-bit VM, I will get the (positive) integer value 16r3FFF8000. In a 64-bit VM, I will get the (negative) integer value '-16r8000'.
Somehow, starting at bit 0, the bits 16 to 29 flip from 0 to 1. In 64-bit, this means a negative number. Not sure about bits 30 and 31 here.
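One way to read the two results, offered only as a sketch: both are consistent with the same negative value, -16r8000, with the 32-bit result being that value truncated to 30 bits. The 30-bit width for character values on a 32-bit VM is an assumption made for this check, and the helper name is hypothetical. The Python below only verifies the arithmetic; it is not VM code.

```python
# Sketch: check that the two reported integers are the same signed value
# seen through different widths.
# ASSUMPTION: a 32-bit VM keeps 30 value bits for an immediate character.
CHAR_BITS_32 = 30


def truncate_unsigned(value: int, bits: int) -> int:
    """Keep the low `bits` bits of a (possibly negative) integer."""
    return value & ((1 << bits) - 1)


observed_64 = -0x8000      # '-16r8000', as reported for the 64-bit VM
observed_32 = 0x3FFF8000   # 16r3FFF8000, as reported for the 32-bit VM
expected = 0x8000          # the code point that was actually typed

# Truncating the negative value to 30 bits gives the 32-bit observation.
truncated = truncate_unsigned(observed_64, CHAR_BITS_32)

# Bit positions where the 32-bit observation differs from the expected value.
diff = observed_32 ^ expected
flipped_bits = [i for i in range(32) if (diff >> i) & 1]

print(hex(truncated))
print(flipped_bits)
```

If the assumption holds, `truncated` equals the 32-bit observation and `flipped_bits` covers exactly bits 16 to 29, which would match the description above.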
Is there a bug in the upper tag bits of immediate characters? Is this related to the 2-byte or 3-byte bytecodes in SistaV1?
Works fine up to 16r7FFF. (This is unrelated to #leadingChar. Mine was 0 in this experiment.)
VM: 202112201228 (VMMaker.oscog-eem.3116)
Best, Marcel
Hi Marcel, which OS? I cannot reproduce on 64-bit macOS:
Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-eem.3172] 5.20211023.2003 Mac OS X built on Mar 6 2022 15:31:16 CET Compiler: 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.46.4) platform sources revision VM: 202110232003
On Tue, 8 Mar 2022 at 17:57, Marcel Taeumel marcel.taeumel@hpi.de wrote:
> [...]
I cannot reproduce on Linux 64 bit either:
(Character value: 16r8000) asInteger hex ==> '16r8000'
Dave
On Tue, Mar 08, 2022 at 06:45:23PM +0100, Nicolas Cellier wrote:
> [...]
Hi Dave, hi Nicolas --
I am working on Windows 10.
> I cannot reproduce on Linux 64 bit either:
> (Character value: 16r8000) asInteger hex ==> '16r8000'
That's not how you would reproduce it. The bug affects character literals, not character objects/instances. You have to evaluate code on that character literal.
Maybe this picture helps:
Best, Marcel
On 8 Mar 2022 at 18:56:09, David T. Lewis lewis@mail.msen.com wrote:
> [...]
Ah OK, I see it on macOS too.
It remains to determine which operation exactly is involved... The TextMorph holding the printed result is correct: a WideString whose last Character is (Character value: 32768).
On Wed, 9 Mar 2022 at 08:08, Marcel Taeumel marcel.taeumel@hpi.de wrote:
> [...]
Hi Nicolas --
There is a bug in the EncoderForSistaV1. The behavior is okay for EncoderForV3PlusClosures. We can discuss this on squeak-dev now, I suppose.
CompiledCode preferredBytecodeSetEncoderClass: EncoderForSistaV1.
CompiledCode preferredBytecodeSetEncoderClass: EncoderForV3PlusClosures.
If you send #halt instead of #asInteger, you get another interesting debugger when trying to start debugging:
Best, Marcel
On 9 Mar 2022 at 08:34:11, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
> [...]
Hi Marcel, yes, I agree: the bug is in the bytecode encoding/decoding of the immediate Character value. I stepped into (Compiler evaluate: (String with: $$ with: (Character value: 16r8000))); if we step into executeMethod, we can inspect what is going on.
On Wed, 9 Mar 2022 at 08:39, Marcel Taeumel marcel.taeumel@hpi.de wrote:
> [...]
Oops, forgot to respond to squeak-dev too...
In #interpretNextSistaV1InstructionFor: we see that extB is interpreted as a signed char:
extB := (extB = 0 and: [extByte > 127]) ifTrue: [extByte - 256] ifFalse: [(extB bitShift: 8) + extByte]
Then, in interpretNext2ByteSistaV1Instruction: bytecode for: client extA: extA extB: extB startPC: startPC, the Character is built from extB:
^client pushSpecialConstant: (Character value: (extB bitShift: 8) + byte)
In our case, extA = 0, extB = -128, bytecode = 233, so the shifted extB is already negative.
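The two quoted expressions can be modelled outside the image to see where the sign comes from. The following is a Python sketch, not Squeak code. The function names are hypothetical, and the `decode` helper assumes a 16-bit code point whose high byte travels in a single extB prefix and whose low byte is the operand, which is an assumption about the encoding rather than something shown above.

```python
def accumulate_ext_b(ext_b: int, ext_byte: int) -> int:
    """Model of the quoted extB update: the first prefix byte is sign-extended."""
    if ext_b == 0 and ext_byte > 127:
        return ext_byte - 256
    return (ext_b << 8) + ext_byte


def pushed_character_value(ext_b: int, byte: int) -> int:
    """Model of `(extB bitShift: 8) + byte` from the quoted decoding."""
    return (ext_b << 8) + byte


def decode(code_point: int) -> int:
    # ASSUMPTION (hypothetical encoding, 16-bit code points only):
    # high byte in one extB prefix, low byte as the bytecode operand.
    ext_b = accumulate_ext_b(0, (code_point >> 8) & 0xFF)
    return pushed_character_value(ext_b, code_point & 0xFF)


print(decode(0x7FFF))  # high byte 127 is not sign-extended
print(decode(0x8000))  # high byte 128 becomes extB = -128
```

Under these assumptions, 16r7FFF decodes to itself while 16r8000 decodes to -32768, which is consistent with the threshold reported at the start of the thread.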
On Wed, 9 Mar 2022 at 10:02, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
> [...]
IOW, since the Character value is unsigned, it would be preferable to use extend A rather than extend B in #genPushCharacter:. My understanding is that this would require a VM change too...
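To illustrate the point about signedness, here is a Python sketch of an unsigned prefix accumulation. It assumes that an unsigned extension simply shifts and adds each prefix byte with no sign extension. The function names are hypothetical, and this is neither the actual #genPushCharacter: nor the VM change under discussion.

```python
from typing import Iterable


def accumulate_unsigned(ext: int, ext_byte: int) -> int:
    """ASSUMED unsigned prefix update: shift and add, no sign extension."""
    return (ext << 8) + ext_byte


def character_value_unsigned(prefix_bytes: Iterable[int], byte: int) -> int:
    """Combine unsigned prefix bytes with the operand byte."""
    ext = 0
    for prefix in prefix_bytes:
        ext = accumulate_unsigned(ext, prefix)
    return (ext << 8) + byte


print(hex(character_value_unsigned([0x80], 0x00)))
print(hex(character_value_unsigned([0x01, 0x80], 0x00)))
```

With an unsigned accumulation, a high byte of 128 stays positive, so 16r8000 would survive the round trip in this model.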
On Wed, 9 Mar 2022 at 10:08, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
> [...]
VM fix (workaround) proposed in VMMakeInbox/VMMaker.oscog-nice.3174
On Wed, 9 Mar 2022 at 10:25, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
IOW, the Character value being unsigned, it would be preferable to use extend A rather than extend B in #genPushCharacter: My understanding is that this would require a VM change too...
Le mer. 9 mars 2022 à 10:08, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> a écrit :
Oups forgot to respond to squeak-dev too...
in #interpretNextSistaV1InstructionFor: we se that extB is interpreted as signed char
extB := (extB = 0 and: [extByte > 127]) ifTrue: [extByte - 256] ifFalse: [(extB bitShift: 8) + extByte]
Then in interpretNext2ByteSistaV1Instruction: bytecode for: client extA: extA extB: extB startPC: startPC
^client pushSpecialConstant: (Character value: (extB bitShift: 8) + byte)
In our case, extA=0, extB=-128, bytecode=233
Le mer. 9 mars 2022 à 10:02, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> a écrit :
Hi Marcel, yes, I agree, the bug is in bytecode encoding/decoding of immediate Character value, I stepped into (Compiler evaluate: (String with: $$ with: (Character value: 16r8000))), and if we step into executeMethod, we can inspect what is going on.
Le mer. 9 mars 2022 à 08:39, Marcel Taeumel marcel.taeumel@hpi.de a écrit :
Hi Nicolas --
There is a bug in the EncoderForSistaV1. The behavior is okay for EncoderForV3PlusClosures. We can discuss this on squeak-dev now, I suppose.
CompiledCode preferredBytecodeSetEncoderClass: EncoderForSistaV1. CompiledCode preferredBytecodeSetEncoderClass: EncoderForV3PlusClosures.
If you do send #halt instead of #asInteger, you get another interesting debugger when trying to start debugging:
Best, Marcel
Am 09.03.2022 08:34:11 schrieb Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com>: Ah OK, I see it on macos too It remains to determine which operation exactly is involved... The TextMorph holding the printed result is correct - a WideString, whose last Character is (Character value: 32768).
Le mer. 9 mars 2022 à 08:08, Marcel Taeumel a écrit :
Hi Dave, hi Nicolas --
I am working in Windows 10.
I cannot reproduce on Linux 64 bit either: (Character value: 16r8000) asInteger hex ==> '16r8000'
That's not how you would reproduce it. The bug affects character
literals,
not character objects/instances. You have to evaluate code on that character literal.
Maybe this picture helps:
Best, Marcel
Am 08.03.2022 18:56:09 schrieb David T. Lewis :
I cannot reproduce on Linux 64 bit either:
(Character value: 16r8000) asInteger hex ==> '16r8000'
Dave
Hi Nicolas --
Thanks! Also for the proposed workaround in VMMaker.oscog-nice.3174.
For what it's worth, one can always replace the character-literal syntax with string access:
$x.
'x' first.
Or store the code point if the visual appearance is not relevant:
Character value: 16r78.
Best, Marcel
On 09.03.2022 10:02:46, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
Hi Marcel, yes, I agree, the bug is in the bytecode encoding/decoding of the immediate Character value. I stepped into (Compiler evaluate: (String with: $$ with: (Character value: 16r8000))), and if we step into executeMethod, we can inspect what is going on.
Or just restrict EncoderForSistaV1>>#genPushCharacter:
...snip...
(code < 0 or: [code > 16r7FFF]) ifTrue:
    [^self outOfRangeError: 'character' index: code range: 0 to: 16r7FFF].
...snip...
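One possible reading of the 16r7FFF bound, offered as an assumption rather than a confirmed diagnosis: if the 16-bit operand of the push-character bytecode is decoded as a signed value, 16r8000 turns into -16r8000, which is the figure reported for the 64-bit VM. Truncating that value to a 30-bit field (also an assumption, chosen because it reproduces the 32-bit figure) gives 16r3FFF8000. The arithmetic can be checked in a workspace; the temporary names are illustrative.

```smalltalk
| code asSigned16 in30Bits |
code := 16r8000.

"Reinterpret the low 16 bits as a signed two's-complement value."
asSigned16 := code >= 16r8000
	ifTrue: [code - 16r10000]
	ifFalse: [code].

"Truncate the sign-extended value to an assumed 30-bit field."
in30Bits := asSigned16 bitAnd: 16r3FFFFFFF.

"asSigned16 is -32768 (-16r8000); in30Bits is 16r3FFF8000."
{asSigned16. in30Bits}
```

Under that reading, every code point up to 16r7FFF keeps its sign bit clear and survives the round trip, which is consistent with the range check proposed above.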
Seeing this, I believe that bit was used for something else in Sista, and we agreed with Eliot that 32k character literals were enough? I cannot remember. I think the bit meant that Cogit should not generate profiling counters for the method, or something like that.
Hi Marcel, yes, I agree, the bug is in bytecode encoding/decoding of immediate Character value, I stepped into (Compiler evaluate: (String with: $$ with: (Character value: 16r8000))), and if we step into executeMethod, we can inspect what is going on.
Le mer. 9 mars 2022 à 08:39, Marcel Taeumel marcel.taeumel@hpi.de a écrit :
Hi Nicolas --
There is a bug in the EncoderForSistaV1. The behavior is okay for EncoderForV3PlusClosures. We can discuss this on squeak-dev now, I suppose.
CompiledCode preferredBytecodeSetEncoderClass: EncoderForSistaV1. CompiledCode preferredBytecodeSetEncoderClass: EncoderForV3PlusClosures.
If you do send #halt instead of #asInteger, you get another interesting debugger when trying to start debugging:
Best, Marcel
Am 09.03.2022 08:34:11 schrieb Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com>: Ah OK, I see it on macos too
It remains to determine which operation exactly is involved...
The TextMorph holding the printed result is correct - a WideString, whose
last Character is (Character value: 32768).
Le mer. 9 mars 2022 à 08:08, Marcel Taeumel a
écrit :
Hi Dave, hi Nicolas --
I am working in Windows 10.
I cannot reproduce on Linux 64 bit either:
(Character value: 16r8000) asInteger hex ==> '16r8000'
That's not how you would reproduce it. The bug affects character
literals,
not character objects/instances. You have to evaluate code on that
character literal.
Maybe this picture helps:
Best,
Marcel
Am 08.03.2022 18:56:09 schrieb David T. Lewis :
I cannot reproduce on Linux 64 bit either:
(Character value: 16r8000) asInteger hex ==> '16r8000'
Dave
On Mar 9, 2022, at 11:24 AM, Clément Béra bera.clement@gmail.com wrote:
Seeing this, I believe that bit was used for something else in Sista, and we agreed with Eliot that 32k literals were enough? I cannot remember. I think the bit meant that Cogit should not generate profiling counters for the method, or something like that.
Exactly
On Wed, Mar 9, 2022 at 3:41 PM Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
Or just restrict EncoderForSistaV1>>#genPushCharacter:
...snip...
(code < 0 or: [code > 16r7FFF]) ifTrue:
	[^self outOfRangeError: 'character' index: code range: 0 to: 16r7FFF].
...snip...
On Wed, 9 Mar 2022 at 14:16, Marcel Taeumel marcel.taeumel@hpi.de wrote:
Hi Nicolas --
Thanks! Also for the proposed workaround in VMMaker.oscog-nice.3174.
For what it's worth, one can always replace the character-literal syntax with string access:
$x.
'x' first.
Or store the code point if the visual appearance is not relevant:
Character value: 16r78.
Best, Marcel
On 09.03.2022 at 10:02:46, Nicolas Cellier nicolas.cellier.aka.nice@gmail.com wrote:
Hi Marcel, yes, I agree: the bug is in the bytecode encoding/decoding of the immediate Character value. I stepped into (Compiler evaluate: (String with: $$ with: (Character value: 16r8000))); stepping further into executeMethod lets us inspect what is going on.
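A minimal workspace sketch of the reproduction described above, assuming a Squeak image whose preferred bytecode set is SistaV1. The temporary names are illustrative and not from the thread. It builds the source text of a character literal for code point 16r8000, evaluates it with Compiler evaluate: as in the expression above, and compares the answer with the intended code point.

```smalltalk
"Assumption: run as a do-it/print-it in a Squeak workspace."
| codePoint source result |
codePoint := 16r8000.

"Source text of the form '$<char> asInteger', so that the compiler
 sees a character literal rather than a Character instance."
source := (String with: $$ with: (Character value: codePoint)), ' asInteger'.

"Compile and run the literal expression."
result := Compiler evaluate: source.

"Answer the result in hex and whether it matches the intended code point."
{ result printStringRadix: 16. result = codePoint }
```

With an affected encoder the thread reports -16r8000 (64-bit) or 16r3FFF8000 (32-bit) as the result, so the comparison would be false; for code points up to 16r7FFF it is reported to work.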
On Wed, 9 Mar 2022 at 08:39, Marcel Taeumel wrote:
Hi Nicolas --
There is a bug in EncoderForSistaV1. The behavior is okay for EncoderForV3PlusClosures. We can discuss this on squeak-dev now, I suppose.
CompiledCode preferredBytecodeSetEncoderClass: EncoderForSistaV1.
CompiledCode preferredBytecodeSetEncoderClass: EncoderForV3PlusClosures.
If you do send #halt instead of #asInteger, you get another interesting debugger when trying to start debugging:
Best, Marcel
-- Clément Béra https://clementbera.github.io/ https://clementbera.wordpress.com/
On Tue, Mar 08, 2022 at 05:57:32PM +0100, Marcel Taeumel wrote:
Hi Eliot, hi all --
I think we have an sign-bit bug for character literals with code points > 16r7FFF.
Steps to reproduce:
- Print it: "Character value: 16r8000"
- Inspect the result by evaluating the character literal or send #asInteger to it. It will most likely not render in a standard Squeak and show up like "$? asInteger".
In a 32-bit VM, I will get the (positive) integer value 16r3FFF8000. In a 64-bit VM, I will get the (negative) integer value '-16r8000'.
Somehow, starting at bit 0, the bits 16 to 29 flip from 0 to 1. In 64-bit, this means a negative number. Not sure about bits 30 and 31 here.
Is there a bug in the upper tag bits of immediate characters? Is this related to the 2-byte or 3-byte byte codes in SistaV1?
Works fine up to 16r7FFF. (This is unrelated to #leadingChar. Mine was 0 in this experiment.)
VM: 202112201228 (VMMaker.oscog-eem.3116)
Best, Marcel
Hi Marcel,
This integer type declaration stuff is enough to give anybody a headache, so here is a tip to make it slightly less obscure without leaving the comfort of the Squeak image.
First load package TwosComplement, either from SqueakMap or from http://www.squeaksource.com/TwosComplement.
Then take a look at the two suspicious integer values that you get from your 32-bit VM and 64-bit VM, rendering them as 32-bit two's complement (the common case for C int on most platforms):
{ 16r3FFF8000 asRegister: 32 . -16r8000 asRegister: 32} inspect.
or the simpler version (since 32 bits is the default):
{ 16r3FFF8000 asRegister . -16r8000 asRegister} inspect.
This shows the low-order 16 bits (the actual character value) as valid in both cases, and the high-order 16 bits as garbage related to integer type declaration and/or sign extension in the VM.
Very likely this will turn out to be an issue in primitive 171, InterpreterPrimitives>>primitiveImmediateAsInteger. And it may be an issue related to integer data types in Windows versus unix-based systems.
The issue will not be related to the upper tag bits of immediate characters, and it will not be related to the 2-byte or 3-byte byte codes in SistaV1. It's just some sort of type declaration issue in the VM code, that's all.
Dave
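For readers without the TwosComplement package, the same comparison can be roughed out with plain Integer protocol. This is only a sketch: the masks are chosen here for illustration and say nothing about where in the VM or encoder the extra bits come from.

```smalltalk
"Render both reported values as unsigned 32-bit words in hex,
 so that the shared low-order 16 bits become visible."
| reported |
reported := { 16r3FFF8000. -16r8000 }.
reported collect: [:each | (each bitAnd: 16rFFFFFFFF) printStringBase: 16].

"Truncate the 64-bit result to 30 bits and compare it with the
 32-bit result; 2^30 - 2^15 = 16r3FFF8000, so this answers true."
(-16r8000 bitAnd: 16r3FFFFFFF) = 16r3FFF8000.
```

The second expression suggests that the two reported values are the same negative number seen through different word sizes, which matches the observation that bits 16 to 29 are set in the 32-bit result.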
vm-dev@lists.squeakfoundation.org