On 30 May 2017, at 05:43, Ben Coman <btc@openInWorld.com> wrote:

On Tue, May 30, 2017 at 3:25 AM, Eliot Miranda <eliot.miranda@gmail.com> wrote:

Hi Ben,

On Sun, May 28, 2017 at 6:46 AM, Ben Coman <btc@openinworld.com> wrote:

Maybe I'm speaking out of turn from the sidelines, but just curious to ask a leading question...

What is the purpose of the Continuous Integration system?

One thing for me is to identify when I've made a mistake. Yes, it is selfish of me

That's fine :). We need you to keep productive ;)
I'm trying to understand your perspective and needs to see if the workflow can be tuned without slowing you down too much.

to want to break a build, but in some way those of us who are active VM developers must be able to commit and see what fails, especially when we have a large platform and VM variant spread.

Creating a PR automatically runs the CI tests. So you'll get that.

I'd propose your continuous work would occur in a single Cog-eem branch. Pulling from the server-Cog branch would be the same as your current workflow. Pushing to the server as a PR would automatically run the CI tests and automatically integrate into the server-Cog branch upon success. There is nothing stopping your work continuing in your Cog-eem branch while the CI tests are running, just the same as you currently continue working in your local-Cog branch.

We can probably devise an efficient way for you to issue the pull request from the command line - something like...
https://github.com/github/hub/issues/1219
https://stackoverflow.com/questions/4037928/can-you-issue-pull-requests-from-the-command-line-on-github

IMO what needs to be added is a way of marking commits as good. I think we should continue to be allowed to break the build (something we do unintentionally).

In my proposal, you break builds in your own Cog-eem branch, and when the CI build succeeds it's automatically integrated into the mainline server-Cog branch. If explicit no-ff merge points are maintained, then that may be sufficient to mark a good-build.

But we need a set of criteria to mark a build as fit for release and have a download URL from which we obtain the latest releasable VM.

Marking a good-build as a release-build would be a further and separate step.
btw, I thought "master" branch was going to track "release-builds" ?

I agree with Fabio that, if we want the CI server to be as green as possible, marking builds which have been failing for a long time as "allowed to fail" is a good idea. But it also has to be brought to people's attention that there are such builds.

I'm not sure of the best way to do this. Perhaps upon "travis_success" would it be possible to scan the log to report allowed-failures of experimental builds?

As far as the Sista and Lowcode VMs go, these are experimental, as would be a threaded FFI VM, something we may have later this year.

Maybe these should be separate feature-branches with their own Travis build, which first only tests the feature specific build for quick turn around, and then triggers the production builds for further information.

It would be nice to segregate the standard VMs that are already in production from these experimental builds so that failures within the experimental set don't affect the status of the production VMs. I'd rather see that segregation than mark certain builds as allowed to fail.

The segregation might be done as a dependent build, so it is only started if the production build succeeds.
But I would guess the latter is quicker to implement, so can we do that first?
Let's get the builds green asap, then move from there.
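
To make the segregation idea concrete, here is a minimal sketch of how one build's job results could be split into a production status and a separate report of experimental failures. It is not how Travis does it. The marker list, the function names, and every job name other than macos32x64 squeak.sista.spur are assumptions made up for illustration.

```python
# Hypothetical sketch: the markers, function names and most job names
# below are illustrative assumptions, not taken from the real build matrix.
EXPERIMENTAL_MARKERS = ("sista", "lowcode")


def is_experimental(job_name):
    """Treat a job as experimental if its name contains a known marker."""
    return any(marker in job_name for marker in EXPERIMENTAL_MARKERS)


def summarise(job_results):
    """job_results maps job name -> True (passed) or False (failed)."""
    failed = sorted(name for name, ok in job_results.items() if not ok)
    production_failures = [n for n in failed if not is_experimental(n)]
    experimental_failures = [n for n in failed if is_experimental(n)]
    return {
        # Only production jobs decide whether the build counts as green.
        "production_green": not production_failures,
        "production_failures": production_failures,
        # Experimental failures are still reported so they stay visible.
        "experimental_failures": experimental_failures,
    }


example = {
    "macos32x64 squeak.sista.spur": False,  # job name from this thread
    "macos32x64 squeak.cog.spur": True,     # assumed job name
    "linux64x64 pharo.cog.spur": True,      # assumed job name
}
report = summarise(example)
print("production green:", report["production_green"])
print("experimental failures:", report["experimental_failures"])
```

The point of keeping the experimental list in the report is that those failures are not hidden, they just stop deciding the colour of the production status.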

@Fabio, what is the actual Travis change that would effect this?

I notice Travis has not had a green build for nearly 120 days
https://travis-ci.org/OpenSmalltalk/opensmalltalk-vm/builds
since Build #603...
https://travis-ci.org/OpenSmalltalk/opensmalltalk-vm/builds/200671152

I'd guess that makes it harder to identify which "new" commits introduce "new" defects.
It was tough for me trying to categorise the historical failures to understand what might be required to make them green.

For example, lately the usual failure has been just a single job...
macos32x64 squeak.sista.spur.
which last worked 22 days ago in Build #713
https://travis-ci.org/OpenSmalltalk/opensmalltalk-vm/builds/228902233
but there are other failures in builds that obscure that fact, way back to #603.
Only an exhaustive search through all builds reveals this.
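
The exhaustive search could in principle be scripted. Below is a rough sketch that assumes the per-job results of each build have already been collected into a plain list (how to fetch them from Travis is not covered here). The build numbers and the second job name are invented sample data, not the real history.

```python
# Hypothetical sketch: history is invented sample data, not real Travis results.
def last_green(history, job_name):
    """Return the highest build number in which job_name passed, or None."""
    passing = [number for number, jobs in history if jobs.get(job_name) is True]
    return max(passing) if passing else None


def sole_failures(history, job_name):
    """Return build numbers in which job_name was the only failing job."""
    result = []
    for number, jobs in history:
        failed = [name for name, ok in jobs.items() if not ok]
        if failed == [job_name]:
            result.append(number)
    return sorted(result)


SISTA = "macos32x64 squeak.sista.spur"   # job name from this thread
OTHER = "linux64x64 squeak.cog.spur"     # assumed job name

history = [
    (1, {SISTA: True, OTHER: True}),
    (2, {SISTA: False, OTHER: True}),
    (3, {SISTA: False, OTHER: False}),
    (4, {SISTA: False, OTHER: True}),
]

print("last green for sista job:", last_green(history, SISTA))
print("builds where it is the only failure:", sole_failures(history, SISTA))
```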

For example, recent Build #748 has macos32x64 squeak.sista.spur as its only failure
https://travis-ci.org/OpenSmalltalk/opensmalltalk-vm/builds/236010112
but then #750, #751, #752, #753 introduce other failures.

Perhaps a contributing factor is commits being pushed directly to the server Cog branch,
with CI tests only running after they are integrated. I guess that was done to keep the workflow
similar for those moving from subversion to git. However it seems to lose some value of CI
in "preventing" build failures from entering the repository. After a year of using git maybe
this workflow could be reconsidered? Perhaps turn on branch protection for administrators
and "force" all commits to the server Cog branch to first pass CI testing?

That's not the reason. The reason is so that we can see if a commit is good across the entire matrix. If there's no way to discover what elements of the matrix are affected by a build then that will reduce my productivity.

You'll get that with a PR. But you'll get test-results before your code is integrated into the server-Cog branch.

Another thing that would reduce productivity would be having to commit to a clone of the CI and then only being able to push to the Cog branch when the clone succeeds, as this would introduce hours of delay.

Could you expand on this? I'm not sure what you mean by "clone of the CI".
If you are continuously working in a Cog-eem branch there is no reason to stop work while the CI tests run. Any further commits pushed in your Cog-eem branch automatically update an open PR and the CI testing automatically runs again and automatically integrates into the server-Cog branch upon success.

Much better is a system of tests that then selects good builds, and something that identifies a good build as one in which the full set of production VMs have passed their tests. And of course one wants to be able to see the converse.
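
As a sketch of that selection rule only: a build is "good" when every production job passed, and the converse lists which production jobs failed in the other builds. The production job set, the build numbers and the results are assumptions for illustration. Fitness for release would need further criteria beyond this.

```python
# Hypothetical sketch: job names other than the sista one, build numbers
# and results are invented for illustration.
def is_good(jobs, production_jobs):
    """A build is good when every production job is present and passed."""
    return all(jobs.get(name) is True for name in production_jobs)


def latest_good_build(history, production_jobs):
    """Return the highest good build number, or None if there is none."""
    good = [number for number, jobs in history if is_good(jobs, production_jobs)]
    return max(good) if good else None


def bad_builds(history, production_jobs):
    """The converse: map each non-good build to its failing production jobs."""
    return {
        number: sorted(n for n in production_jobs if jobs.get(n) is not True)
        for number, jobs in history
        if not is_good(jobs, production_jobs)
    }


PRODUCTION = {"linux64x64 squeak.cog.spur", "macos32x64 pharo.cog.spur"}
builds = [
    (1, {"linux64x64 squeak.cog.spur": True, "macos32x64 pharo.cog.spur": True}),
    (2, {"linux64x64 squeak.cog.spur": False, "macos32x64 pharo.cog.spur": True}),
    (3, {"linux64x64 squeak.cog.spur": True, "macos32x64 pharo.cog.spur": True,
         "macos32x64 squeak.sista.spur": False}),  # experimental failure only
]

print("latest good build:", latest_good_build(builds, PRODUCTION))
print("bad builds:", bad_builds(builds, PRODUCTION))
```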

Of course needing to submit everything via PRs adds workflow overhead, but some workflows might minimise the impact.
So I can't argue whether the benefit is worth it, since I'm not working in the VM every day.
I'm just bumping the status quo to table the question for discussion.

I don't see the benefit of submitting everything via PR when the need is to define the latest releasable VM, which needs additional testing and packaging, not just green builds.

Having green incremental commit-builds is a separate issue from having a releasable build. The former provides tighter tracking of breakages. For example, when someone else's commits break the build, this confounds the test results of your own commits and I guess introduces at least some minor delay to work out.

Just because there is a higher-level need doesn't mean the lesser commit-builds are unimportant. They are a prerequisite for release-builds. Can we do the easier step first?

cheers -ben

P.S. Should macos32x64 squeak.sista.spur be fixed, or temporarily removed from the build matrix?
A continually failing build seems to serve no purpose, whereas green builds should be a great help to noticing new failures.

I hope it can be both fixed and put in an experimental matrix.

_,,,^..^,,,_
best, Eliot

cheers -ben

P.S. Just bumped into an interesting (short) article: "a few conceptual difficulties really stand out."
https://stevebennett.me/2014/02/26/git-what-they-didnt-tell-you/

P.P.S. I think in a really ideal system, once Travis vetted the final commit of the PR and was ready to merge, it would go back and test each interposing commit and squash it if its build failed, so that every commit in the repo was a green build. But I guess that is just dreaming.