Hi,
I finally got around to enabling automatic nightly benchmarking for all branches of both opensmalltalk-vm and rsqueak. Previously I had to trigger benchmarks for opensmalltalk-vm manually. Now a nightly cron job checks all branches for new passing builds and then tries to run benchmarks against the resulting binaries. Note, however, that we're only uploading binaries for the master and Cog branches right now on opensmalltalk-vm, so all other branches are scheduled but then skipped, because no binaries are available.
The benchmarking machine is still running at this point; a full set of benchmarks takes about 4-5 hours per binary, so I expect the results for yesterday's VMs to come in this afternoon (GMT+1). Not all of the benchmarks are well scaled right now, so that might need some tweaking, but since there has been talk about various performance optimizations recently, I figured we should be more vigilant about tracking their results. For example, Clement mentioned recent optimizations that should show up in binarytrees. It would be nice if we could see that in the timeline.
This page in particular may be interesting to check once in a while:
http://speed.squeak.org/timeline/#/?exe=1,5&ben=grid&env=2&revs=...
If anyone wants to take a look at the code and benchmarks, the about page links to all relevant repositories. I'm happy to give people access so they can add or change or remove benchmarks.
cheers, Tim
On Wed, Jul 27, 2016 at 2:56 PM, Tim Felgentreff timfelgentreff@gmail.com wrote:
Are they all a "less is better" comparison?
cheers -ben
Yes, they are. The website isn't the best, but you can click your way through to see the actual numbers and stdev, too.
Best, Tim
On 27 July 2016 at 10:08, Ben Coman btc@openinworld.com wrote:
On Wed, Jul 27, 2016 at 2:56 PM, Tim Felgentreff timfelgentreff@gmail.com wrote:
Hi,
I finally got around to enabling automatic nightly benchmarking for all branches for both opensmalltalk-vm and rsqueak. Before I had to manually trigger benchmarks for opensmalltalk-vm. Now a nightly cron job checks for new passed builds on all branches and then tries to run benchmarks for the resulting binaries. Note, however, that we're only uploading binaries for the master and Cog branch right now on opensmalltalk-vm, so all other branches are schedules but then skipped, because no binaries are available.
The benchmarking machine is still running at this point, a full set of benchmarks takes about 4-5 hours per binary. So I expect the results for yesterdays VMs to come in this afternoon GMT+1. Not all of the benchmarks are well scaled right now, so that might need some tweaking, but since there has been talk about various performance optimizations recently, I figured we should be more vigilant in tracking the results of that. For example, Clement mentioned recent optimizations that should have results in binarytrees. It's nice if we can see that in the timeline.
This page in particular may be interesting to check once in a while:
http://speed.squeak.org/timeline/#/?exe=1,5&ben=grid&env=2&revs=...
If anyone wants to take a look at the code and benchmarks, the about page links to all relevant repositories. I'm happy to give people access so they can add or change or remove benchmarks.
cheers, Tim
Are they all a "less is better" comparison?
Hi Tim (F),
find attached a minor cleanup of SMark. I'd like to have permission to write to the repository if that's ok with you; I don't have it right now.
On Tue, Jul 26, 2016 at 11:56 PM, Tim Felgentreff timfelgentreff@gmail.com wrote:
Hi Eliot,
sure, I can add you. Do you have an account at http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource?
Best, Tim
On Fri, 7 Oct 2016 at 18:24 Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Tim (F),
find attached a minor cleanup of SMark. I'd like to have remission to
write to the repository if that's ok with you; I don't have it right now.
Hi Tim,
On Mon, Oct 10, 2016 at 6:58 AM, Tim Felgentreff timfelgentreff@gmail.com wrote:
Hi Eliot,
sure, I can add you. Do you have an account at http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource?
Yes, Eliot Miranda, username eem
-- _,,,^..^,,,_ best, Eliot
Ok, added you and uploaded your update.
The servers were offline for a while and I was out of the office, so I didn't notice; but they're back and are currently running benchmarks for a build from Oct 4.
cheers, Tim
On Mon, 10 Oct 2016 at 18:46 Eliot Miranda eliot.miranda@gmail.com wrote:
Yes, Eliot Miranda, username eem
Thanks!
On Mon, Oct 10, 2016 at 10:18 AM, Tim Felgentreff timfelgentreff@gmail.com wrote:
Ok, added you and uploaded your update.
-- _,,,^..^,,,_ best, Eliot
Hi Tim,
On Tue, Jul 26, 2016 at 11:56 PM, Tim Felgentreff timfelgentreff@gmail.com wrote:
http://speed.squeak.org/timeline/#/?exe=1,5&ben=grid&env=2&revs=50&equid=on
Would it be possible to add some indication to the page of which direction is faster? From the graphs I can't tell whether higher means faster or slower. There are no labels :-(
_,,,^..^,,,_ best, Eliot
Use the "Changes" table. E.g.: http://speed.squeak.org/changes/?tre=10&rev=2016100420&exe=5&env...
Levente
On Mon, Oct 10, 2016 at 7:40 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
Would it be possible to add some indication to the page of which direction is faster? From the graphs I can't tell whether higher means faster or slower. There are no labels :-(
If you click on one of the tiny charts you get a full chart with labels. In all of the benchmarks, "up" means "more seconds" means "slower".
- Bert -
Indeed, "up" is "slower", but I will think about how to add some indication of that on the overview page nonetheless. I guess it isn't there by default because the speed center website is agnostic to what kinds of benchmarks you use, and writing it on each tiny chart may look messy. But I'll try it out and see how it looks :)
cheers, Tim
On Mon, 10 Oct 2016 at 20:26 Bert Freudenberg bert@freudenbergs.de wrote:
If you click on one of the tiny charts you get a full chart with labels. In all of the benchmarks, "up" means "more seconds" means "slower".
One issue we might want to discuss is the selection and sizing of the benchmarks.
All benchmarks are autosized to run for at least 600ms right now, and then those 600ms runs are repeated up to 100 times to obtain measurements. But 600ms might be too short for some benchmarks, for example if the GC only rarely kicks in during a run; I don't know.
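The auto-sizing scheme above could be sketched roughly like this. This is a hypothetical Python illustration of the idea, not the actual SMark harness; all names are made up:

```python
import time

def autosize_and_measure(bench, min_run_ms=600, max_runs=100):
    """Sketch of the scheme described above (hypothetical, not SMark):
    double the inner iteration count until one run takes at least
    min_run_ms, then repeat that sized run up to max_runs times."""
    iterations = 1
    while True:
        start = time.perf_counter()
        for _ in range(iterations):
            bench()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if elapsed_ms >= min_run_ms:
            break
        iterations *= 2
    # Re-run the sized benchmark repeatedly and collect per-run timings.
    timings = []
    for _ in range(max_runs):
        start = time.perf_counter()
        for _ in range(iterations):
            bench()
        timings.append((time.perf_counter() - start) * 1000.0)
    return iterations, timings
```

One consequence of this design, as noted above, is that effects which only occur on longer timescales (such as infrequent GCs) may never fall inside a single 600ms window.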
About the selection of benchmarks: ToolInteraction went up, but that is a very high-level benchmark and might also be influenced heavily by refactorings in Morphic. So maybe the benchmark isn't all that useful, or I should stop tracking trunk with the benchmark images and instead stay on the release.
What do you think?
If you want to benchmark the VM, then the image should be fixed. If you want to benchmark the system, then keep updating the image. The best would be to have both.
Levente
Hi Tim,
On Mon, Oct 10, 2016 at 12:53 PM, Tim Felgentreff timfelgentreff@gmail.com wrote:
One issue we might want to discuss is the selection and sizing of the benchmarks.
All benchmarks are autosized to run for at least 600ms right now, and then those 600ms runs are repeated up to 100 times to obtain measurements. But 600ms might be too short for some benchmarks, for example if the GC only rarely kicks in during a run; I don't know.
600ms is simply too short. Back in the day, when nfib was a popular activation benchmark used to compare different languages (and it's still useful for this today), the approach was to increase the argument to nfib until the run took 30 seconds or more. To get activations per second one divided the result of nfib by the time taken.
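The nfib recipe can be sketched as follows. This is a hedged Python rendition for illustration: nfib returns the number of function activations it performs, so dividing its result by the elapsed time gives activations (calls) per second.

```python
import time

def nfib(n):
    # nfib counts its own activations: 1 for this call plus
    # the activations of the two recursive calls.
    if n < 2:
        return 1
    return 1 + nfib(n - 1) + nfib(n - 2)

def activations_per_second(target_seconds=30.0):
    """Increase the argument until a run takes at least target_seconds
    (Eliot suggests 30s or more), then divide calls by elapsed time."""
    n = 10
    while True:
        start = time.perf_counter()
        calls = nfib(n)
        elapsed = time.perf_counter() - start
        if elapsed >= target_seconds:
            return calls / elapsed
        n += 1
```

A smaller `target_seconds` gives a quick but noisier figure; the long target is exactly what damps out scheduling and GC noise.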
About the selection of benchmarks: ToolInteraction went up, but that is a very high-level benchmark and might also be influenced heavily by refactorings in Morphic. So maybe the benchmark isn't all that useful, or I should stop tracking trunk with the benchmark images and instead stay on the release.
The original Smalltalk-80 benchmark suite included senders and implementors, whose performance depended not only on the implementation but on the size of the image. I modified the VisualWorks version to scale by the number of methods used. That is necessary if one is to measure tool performance, and measuring tool performance is useful in producing a responsive system. But one must be careful to measure something meaningful.
What do you think?
Categorising the benchmarks and describing them well is important. Making sure they vary by quality of implementation, and not by extraneous causes such as code base size, is important. Starting all benchmarks from a consistent state, and not running a whole set of benchmarks in the same image (unless one "resets" by throwing away jitted code and GCing), is important. Repeating each benchmark some number of times (e.g. three) and taking the average or the median is important. Running benchmarks on an otherwise quiet machine is important.
And when we have Sista, running the benchmarks such that the system can warm up is important. We will want to see baseline and warm performance compared.
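The discipline described above (consistent starting state, a few repeats, taking the median, optional warm-up) might look roughly like this. A minimal hypothetical sketch, not the actual harness; the names are made up:

```python
import gc
import statistics
import time

def measure(bench, repeats=3, warmup=0):
    """Sketch of the methodology described above (hypothetical names):
    optionally warm up first, start each run from a consistent heap
    state, repeat a few times, and report the median to damp outliers."""
    for _ in range(warmup):
        bench()  # let a JIT or adaptive optimizer settle before measuring
    timings = []
    for _ in range(repeats):
        gc.collect()  # "reset" between runs for a consistent state
        start = time.perf_counter()
        bench()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

Comparing `measure(bench, warmup=0)` against `measure(bench, warmup=10)` is one simple way to contrast baseline and warm performance once adaptive optimization is in play.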
On 10-10-2016, at 2:33 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
But one must be careful to measure something meaningful.
Exactly. The best one-liner I know about this is “When a measure becomes a benchmark it ceases to be a useful measure”.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Thesaurus: Ancient reptile with a truly extensive vocabulary
vm-dev@lists.squeakfoundation.org