As reported by Marcel on Squeak-dev, this snippet:
Compiler evaluate: (String with: $$ with: (Character value: 16r8000)).
will return a Character with a negative value (-32768).
This is because character literal in the range 16r100-16rFFFF are encoded with an extended B bytecode, and the extended B is interpreted as signed value.
I suggested - a workaround : using `(extB bitAnd: 16rFF) << 8` to reconstruct an unsigned Character value - a fix: using extA instead of extB so as to have it interpreted unsigned
See https://source.squeak.org/VMMakerInbox/VMMaker.oscog-nice.3174.diff for the former solution.
The later solution require more changes and a careful recompilation of CompiledMethods at image side.
Note that restricting `EncoderForSistaV1>>genPushCharacter:` is a safe image-side workaround until the VM is fixed.
I think both image and VM have to be fixed. This was a slip by me; apologies. Clearly, the bytecode is wrongly specified. Instead of taking the signed extension B it should take the unsigned extension A. This was a copy-paste error copying the push integer bytecode. So instead of ```233 11101001 i i i i i i i i Push Character #iiiiiiii (+ Extend B * 256)``` the bytecode should be ```233 11101001 i i i i i i i i Push Character #iiiiiiii (+ Extend A * 256)```. I will make both of these changes.
I think it's safe to just make the change. The extended form isn't in use in a VMMaker image (trunk + a lot of other stuff) so it won't affect most people. Marcel will be affected but he can update to a new VM (right Marcel)?
Here's code that checks: self sn browseAllSelect: [:m| m scanFor: [:a :b :c| a = 16rE1 and: [c = 16rE9]]]
Note that the change is unfortunate. Because the existing VM doesn't clear extA afterwards setting extA can have disastrous effects in the old VM. e.g. if followed by a block creation bytecode then that bytecode will consume the extension not cleared by the push character bytecode. But as I said it is unlikely to bite anyone. Fingers crossed.
Hacking the existing spec to use a sign extension is less harmful in the short term, but more harmful in the long term.
I shall make the new VM take either extA or extB and clear them both, at least temporarily. We can then change the image code to use extA, and hope that no-one is affected, because Marcel, you can upgrade to a new VM and Nicolas, your fix to avoid the use of the bytecode will save us for now.
Closed #618 via d5835e5ea2d427647e6b6bef7d11d6eba4fe7ea8.
🎉
vm-dev@lists.squeakfoundation.org