A new version of Collections was added to project The Inbox: http://source.squeak.org/inbox/Collections-ct.950.mcz
==================== Summary ====================
Name: Collections-ct.950 Author: ct Time: 30 June 2021, 1:06:39.38741 am UUID: 58b871bc-ace9-184b-8e98-305453096350 Ancestors: Collections-mt.945
Fixes Stream >> #take:. Unlike in #any:, we must not return nil values from #next here but the enumeration earlier. See CollectionsTests-ct.361.
=============== Diff against Collections-mt.945 ===============
Item was changed: ----- Method: Stream>>take: (in category 'collections - accessing') ----- take: maxNumberOfElements "See Collection protocol." + + | aCollection | + aCollection := OrderedCollection new. + maxNumberOfElements timesRepeat: [ + self atEnd ifTrue: [^ aCollection]. + aCollection addLast: self next]. + ^ aCollection! - - ^ self any: maxNumberOfElements!
Hmm... I would prefer to just patch Generator instead of Stream:
s := self environment allClasses readStream. [s reset; take: 3000] bench
AFTER: '5,700 per second. 175 microseconds per run. 7.59848 % GC time.' BEFORE: '143,000 per second. 6.99 microseconds per run. 33.90644 % GC time.'
Best, Marcel Am 30.06.2021 01:06:49 schrieb commits@source.squeak.org commits@source.squeak.org: A new version of Collections was added to project The Inbox: http://source.squeak.org/inbox/Collections-ct.950.mcz
==================== Summary ====================
Name: Collections-ct.950 Author: ct Time: 30 June 2021, 1:06:39.38741 am UUID: 58b871bc-ace9-184b-8e98-305453096350 Ancestors: Collections-mt.945
Fixes Stream >> #take:. Unlike in #any:, we must not return nil values from #next here but the enumeration earlier. See CollectionsTests-ct.361.
=============== Diff against Collections-mt.945 ===============
Item was changed: ----- Method: Stream>>take: (in category 'collections - accessing') ----- take: maxNumberOfElements "See Collection protocol." + + | aCollection | + aCollection := OrderedCollection new. + maxNumberOfElements timesRepeat: [ + self atEnd ifTrue: [^ aCollection]. + aCollection addLast: self next]. + ^ aCollection! - - ^ self any: maxNumberOfElements!
Hi Marcel,
Hmm... I would prefer to just patch Generator instead of Stream:
Not convinced. :-) As far as I understand the general protocol of Streams, #next is expected to answer either the next element, if any, or otherwise, nil. With this knowledge, Stream >> #take: is just erroneous because it will not honor the situation in which the receiver has fewer elements than requested. IMO correctness is more important than optimization.
Nevertheless, I see that for ReadStreams, we can indeed optimize #take:. We just need to map ReadStream >> #take: to #next:, which - unlike its superclass - will not answer more elements than available. Then it should be at least as fast as in the current Trunk.
Do you think this would be the right approach? Shall I upload a new inbox version for this? :-)
Best, Christoph
________________________________ Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Taeumel, Marcel Gesendet: Mittwoch, 30. Juni 2021 11:06:21 An: squeak-dev Betreff: Re: [squeak-dev] The Inbox: Collections-ct.950.mcz
Hmm... I would prefer to just patch Generator instead of Stream:
s := self environment allClasses readStream. [s reset; take: 3000] bench
AFTER: '5,700 per second. 175 microseconds per run. 7.59848 % GC time.' BEFORE: '143,000 per second. 6.99 microseconds per run. 33.90644 % GC time.'
Best, Marcel
Am 30.06.2021 01:06:49 schrieb commits@source.squeak.org commits@source.squeak.org:
A new version of Collections was added to project The Inbox: http://source.squeak.org/inbox/Collections-ct.950.mcz
==================== Summary ====================
Name: Collections-ct.950 Author: ct Time: 30 June 2021, 1:06:39.38741 am UUID: 58b871bc-ace9-184b-8e98-305453096350 Ancestors: Collections-mt.945
Fixes Stream >> #take:. Unlike in #any:, we must not return nil values from #next here but the enumeration earlier. See CollectionsTests-ct.361.
=============== Diff against Collections-mt.945 ===============
Item was changed: ----- Method: Stream>>take: (in category 'collections - accessing') ----- take: maxNumberOfElements "See Collection protocol." + + | aCollection | + aCollection := OrderedCollection new. + maxNumberOfElements timesRepeat: [ + self atEnd ifTrue: [^ aCollection]. + aCollection addLast: self next]. + ^ aCollection! - - ^ self any: maxNumberOfElements!
Hi Christoph,
you found a bug in Generator, not ReadStream nor WriteStream. Both work as expected. ;-) That's why the simplest fix would be in Generator.
Yes, we can change the default behavior as you proposed and overwrite it in PositionableStream for better performance. Please double-check what the correct subclass of Stream should be. I think that ReadStream is too specific.
Best, Marcel Am 30.06.2021 14:44:41 schrieb Thiede, Christoph christoph.thiede@student.hpi.uni-potsdam.de: Hi Marcel,
Hmm... I would prefer to just patch Generator instead of Stream:
Not convinced. :-) As far as I understand the general protocol of Streams, #next is expected to answer either the next element, if any, or otherwise, nil. With this knowledge, Stream >> #take: is just erroneous because it will not honor the situation in which the receiver has fewer elements than requested. IMO correctness is more important than optimization.
Nevertheless, I see that for ReadStreams, we can indeed optimize #take:. We just need to map ReadStream >> #take: to #next:, which - unlike its superclass - will not answer more elements than available. Then it should be at least as fast as in the current Trunk.
Do you think this would be the right approach? Shall I upload a new inbox version for this? :-)
Best, Christoph Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Taeumel, Marcel Gesendet: Mittwoch, 30. Juni 2021 11:06:21 An: squeak-dev Betreff: Re: [squeak-dev] The Inbox: Collections-ct.950.mcz Hmm... I would prefer to just patch Generator instead of Stream:
s := self environment allClasses readStream. [s reset; take: 3000] bench
AFTER: '5,700 per second. 175 microseconds per run. 7.59848 % GC time.' BEFORE: '143,000 per second. 6.99 microseconds per run. 33.90644 % GC time.'
Best, Marcel Am 30.06.2021 01:06:49 schrieb commits@source.squeak.org commits@source.squeak.org: A new version of Collections was added to project The Inbox: http://source.squeak.org/inbox/Collections-ct.950.mcz
==================== Summary ====================
Name: Collections-ct.950 Author: ct Time: 30 June 2021, 1:06:39.38741 am UUID: 58b871bc-ace9-184b-8e98-305453096350 Ancestors: Collections-mt.945
Fixes Stream >> #take:. Unlike in #any:, we must not return nil values from #next here but the enumeration earlier. See CollectionsTests-ct.361.
=============== Diff against Collections-mt.945 ===============
Item was changed: ----- Method: Stream>>take: (in category 'collections - accessing') ----- take: maxNumberOfElements "See Collection protocol." + + | aCollection | + aCollection := OrderedCollection new. + maxNumberOfElements timesRepeat: [ + self atEnd ifTrue: [^ aCollection]. + aCollection addLast: self next]. + ^ aCollection! - - ^ self any: maxNumberOfElements!
Hi Marcel,
the bug manifests itself in Generator, yes, but for the reasons I tried to explain before, I think that the defect is indeed located in Stream.
Please double-check what the correct subclass of Stream should be. I think that ReadStream is too specific.
http://www.hpi.de/ The problem with PositionableStream is that its implementation of #next: still answers a collection with exactly n elements some of which might be excess values (nil). ReadStream is the first class in the inheritance line in which #next: automatically stops the enumeration at the end of the receiver and thus behaves like #take:. Given a hypothetical alternative subclass of PositionableStream which is neither ReadStream nor WriteStream (i.e., let's imagine a decorator class for another stream), it might be affected by the discrepancy between #next: and take: as well. For this reason, I think we should only apply the optimization of #take: to ReadStream.
What do you think? :-)
Best, Christoph ________________________________ Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Taeumel, Marcel Gesendet: Mittwoch, 30. Juni 2021 14:53:24 An: squeak-dev Betreff: Re: [squeak-dev] The Inbox: Collections-ct.950.mcz
Hi Christoph,
you found a bug in Generator, not ReadStream nor WriteStream. Both work as expected. ;-) That's why the simplest fix would be in Generator.
Yes, we can change the default behavior as you proposed and overwrite it in PositionableStream for better performance. Please double-check what the correct subclass of Stream should be. I think that ReadStream is too specific.
Best, Marcel
Am 30.06.2021 14:44:41 schrieb Thiede, Christoph christoph.thiede@student.hpi.uni-potsdam.de:
Hi Marcel,
Hmm... I would prefer to just patch Generator instead of Stream:
Not convinced. :-) As far as I understand the general protocol of Streams, #next is expected to answer either the next element, if any, or otherwise, nil. With this knowledge, Stream >> #take: is just erroneous because it will not honor the situation in which the receiver has fewer elements than requested. IMO correctness is more important than optimization.
Nevertheless, I see that for ReadStreams, we can indeed optimize #take:. We just need to map ReadStream >> #take: to #next:, which - unlike its superclass - will not answer more elements than available. Then it should be at least as fast as in the current Trunk.
Do you think this would be the right approach? Shall I upload a new inbox version for this? :-)
Best, Christoph
________________________________ Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Taeumel, Marcel Gesendet: Mittwoch, 30. Juni 2021 11:06:21 An: squeak-dev Betreff: Re: [squeak-dev] The Inbox: Collections-ct.950.mcz
Hmm... I would prefer to just patch Generator instead of Stream:
s := self environment allClasses readStream. [s reset; take: 3000] bench
AFTER: '5,700 per second. 175 microseconds per run. 7.59848 % GC time.' BEFORE: '143,000 per second. 6.99 microseconds per run. 33.90644 % GC time.'
Best, Marcel
Am 30.06.2021 01:06:49 schrieb commits@source.squeak.org commits@source.squeak.org:
A new version of Collections was added to project The Inbox: http://source.squeak.org/inbox/Collections-ct.950.mcz
==================== Summary ====================
Name: Collections-ct.950 Author: ct Time: 30 June 2021, 1:06:39.38741 am UUID: 58b871bc-ace9-184b-8e98-305453096350 Ancestors: Collections-mt.945
Fixes Stream >> #take:. Unlike in #any:, we must not return nil values from #next here but the enumeration earlier. See CollectionsTests-ct.361.
=============== Diff against Collections-mt.945 ===============
Item was changed: ----- Method: Stream>>take: (in category 'collections - accessing') ----- take: maxNumberOfElements "See Collection protocol." + + | aCollection | + aCollection := OrderedCollection new. + maxNumberOfElements timesRepeat: [ + self atEnd ifTrue: [^ aCollection]. + aCollection addLast: self next]. + ^ aCollection! - - ^ self any: maxNumberOfElements!
Hi Christoph,
thanks for checking. Sounds good. :-) Both your proposed bugfix and the overall refactoring to preserve the performance in read-streams.
Best, Marcel Am 30.06.2021 16:00:52 schrieb Thiede, Christoph christoph.thiede@student.hpi.uni-potsdam.de: Hi Marcel,
the bug manifests itself in Generator, yes, but for the reasons I tried to explain before, I think that the defect is indeed located in Stream.
Please double-check what the correct subclass of Stream should be. I think that ReadStream is too specific.
[http://www.hpi.de/] The problem with PositionableStream is that its implementation of #next: still answers a collection with exactly n elements some of which might be excess values (nil). ReadStream is the first class in the inheritance line in which #next: automatically stops the enumeration at the end of the receiver and thus behaves like #take:. Given a hypothetical alternative subclass of PositionableStream which is neither ReadStream nor WriteStream (i.e., let's imagine a decorator class for another stream), it might be affected by the discrepancy between #next: and take: as well. For this reason, I think we should only apply the optimization of #take: to ReadStream.
What do you think? :-)
Best, Christoph Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Taeumel, Marcel Gesendet: Mittwoch, 30. Juni 2021 14:53:24 An: squeak-dev Betreff: Re: [squeak-dev] The Inbox: Collections-ct.950.mcz Hi Christoph,
you found a bug in Generator, not ReadStream nor WriteStream. Both work as expected. ;-) That's why the simplest fix would be in Generator.
Yes, we can change the default behavior as you proposed and overwrite it in PositionableStream for better performance. Please double-check what the correct subclass of Stream should be. I think that ReadStream is too specific.
Best, Marcel Am 30.06.2021 14:44:41 schrieb Thiede, Christoph christoph.thiede@student.hpi.uni-potsdam.de: Hi Marcel,
Hmm... I would prefer to just patch Generator instead of Stream:
Not convinced. :-) As far as I understand the general protocol of Streams, #next is expected to answer either the next element, if any, or otherwise, nil. With this knowledge, Stream >> #take: is just erroneous because it will not honor the situation in which the receiver has fewer elements than requested. IMO correctness is more important than optimization.
Nevertheless, I see that for ReadStreams, we can indeed optimize #take:. We just need to map ReadStream >> #take: to #next:, which - unlike its superclass - will not answer more elements than available. Then it should be at least as fast as in the current Trunk.
Do you think this would be the right approach? Shall I upload a new inbox version for this? :-)
Best, Christoph Von: Squeak-dev squeak-dev-bounces@lists.squeakfoundation.org im Auftrag von Taeumel, Marcel Gesendet: Mittwoch, 30. Juni 2021 11:06:21 An: squeak-dev Betreff: Re: [squeak-dev] The Inbox: Collections-ct.950.mcz Hmm... I would prefer to just patch Generator instead of Stream:
s := self environment allClasses readStream. [s reset; take: 3000] bench
AFTER: '5,700 per second. 175 microseconds per run. 7.59848 % GC time.' BEFORE: '143,000 per second. 6.99 microseconds per run. 33.90644 % GC time.'
Best, Marcel Am 30.06.2021 01:06:49 schrieb commits@source.squeak.org commits@source.squeak.org: A new version of Collections was added to project The Inbox: http://source.squeak.org/inbox/Collections-ct.950.mcz
==================== Summary ====================
Name: Collections-ct.950 Author: ct Time: 30 June 2021, 1:06:39.38741 am UUID: 58b871bc-ace9-184b-8e98-305453096350 Ancestors: Collections-mt.945
Fixes Stream >> #take:. Unlike in #any:, we must not return nil values from #next here but the enumeration earlier. See CollectionsTests-ct.361.
=============== Diff against Collections-mt.945 ===============
Item was changed: ----- Method: Stream>>take: (in category 'collections - accessing') ----- take: maxNumberOfElements "See Collection protocol." + + | aCollection | + aCollection := OrderedCollection new. + maxNumberOfElements timesRepeat: [ + self atEnd ifTrue: [^ aCollection]. + aCollection addLast: self next]. + ^ aCollection! - - ^ self any: maxNumberOfElements!
squeak-dev@lists.squeakfoundation.org