Hi everyone,
while working on parsing network data I tried to use asOctetString and noticed that it did not yield a ByteString when called on a WideString. Is this the intended behavior? I would have expected behavior similar to what I documented in the test case below. If not, I think the problem simply originates from using #at:, which is overridden by WideString.
Bests, Patrick
testAsOctetStringFromWideString
	| rawStringOctet wideStringAsOctet wideString |
	rawStringOctet := #[103 114 252 223 101 "The character 16r1FA02 starts here" 0 1 250 2].
	wideString := 'grüße' , (String value: 16r1FA02).
	wideStringAsOctet := wideString asOctetString asByteArray.
	self assert: rawStringOctet equals: wideStringAsOctet.
Hi Patrick,
The name of the method is misleading. The intention was neither to change the encoding of the receiver nor to filter out out-of-range bytes, but to create a ByteString from a WideString when it contains only byte characters. So the method will return a string equal to the receiver. The returned string will be a ByteString if and only if #isOctetString returns true. I don't think the conversion in your example would make much sense, because it's not reversible: there's no way to recreate the string from a ByteArray.
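To illustrate the reversibility point, here is a minimal workspace sketch. The block naiveBytes is hypothetical and not part of the image: it only mimics the byte layout the test case expects (one byte per byte character, four big-endian bytes per wide character). Under that assumption, two different strings flatten to the same bytes, so the original cannot be recovered from the ByteArray alone.

```smalltalk
| naiveBytes wideString lookalike |
"Hypothetical scheme, assumed from the expected bytes in the test case:
 one byte per byte character, four big-endian bytes per wide character."
naiveBytes := [:aString |
	ByteArray streamContents: [:stream |
		aString do: [:char | | code |
			code := char asInteger.
			code < 256
				ifTrue: [stream nextPut: code]
				ifFalse: [4 to: 1 by: -1 do: [:i | stream nextPut: (code digitAt: i)]]]]].

"Six characters, the last one wide."
wideString := 'grüße' , (String value: 16r1FA02).

"Nine byte characters whose last four happen to be the bytes of 16r1FA02."
lookalike := 'grüße' , (String value: 0) , (String value: 1) , (String value: 250) , (String value: 2).

"The two strings differ, yet the naive scheme maps them to the same bytes."
wideString = lookalike.
(naiveBytes value: wideString) = (naiveBytes value: lookalike).
```

A reader of the bytes has no way to tell whether the trailing 0 1 250 2 is one wide character or four byte characters, which is why an explicit encoding would be needed for network data.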
Levente
(In reply to the message from patrick.rein@hpi.uni-potsdam.de of Fri, 14 Jun 2019.)
Hi Levente,
thanks for clarifying this :) I will add a comment to the method to document the intent and a corresponding test case.
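Such test cases might look roughly like the sketch below. The selectors are hypothetical, and WideString from: is assumed to be available in the image for building a WideString that holds only byte characters. The assertions only restate what Levente described: the result is equal to the receiver, and it is a ByteString if and only if #isOctetString returns true.

```smalltalk
testAsOctetStringWithByteCharactersOnly
	"Hypothetical test: a WideString holding only byte characters
	 converts to an equal ByteString."
	| wideString octetString |
	wideString := WideString from: 'grüße'.
	octetString := wideString asOctetString.
	self assert: wideString isOctetString.
	self assert: octetString isByteString.
	self assert: wideString equals: octetString

testAsOctetStringWithWideCharacter
	"Hypothetical test: with a character above 255 the result is still
	 equal to the receiver, but it is not a ByteString."
	| wideString octetString |
	wideString := 'grüße' , (String value: 16r1FA02).
	octetString := wideString asOctetString.
	self deny: wideString isOctetString.
	self deny: octetString isByteString.
	self assert: wideString equals: octetString
```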
Bests, Patrick
(In reply to the message from Levente Uzonyi leves@caesar.elte.hu of 17 June 2019, 17:35.)
squeak-dev@lists.squeakfoundation.org