Hi Ben,

On Sun, May 28, 2017 at 6:46 AM, Ben Coman <btc@openinworld.com> wrote:

Maybe I'm speaking out of turn from the sidelines, but just curious to ask a leading question... 

What is the purpose of the Continuous Integration system?

One thing for me is to identify when I've made a mistake.  Yes, it is selfish of me to want to be able to break a build, but in some way those of us who are active VM developers must be able to commit and see what fails, especially when we have a large spread of platforms and VM variants.

IMO what needs to be added is a way of marking commits as good.  I think we should continue to be allowed to break the build (something we do unintentionally anyway).  But we need a set of criteria for marking a build as fit for release, and a download URL from which we can obtain the latest releasable VM.

I agree with Fabio that if we want the CI server to be as green as possible, then marking builds which have been failing for a long time as "allowed to fail" is a good idea.  But it also has to be brought to people's attention that there are such builds.
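For what it's worth, Travis supports "allowed to fail" directly in the build matrix.  A minimal sketch follows; the matrix keys (os, env names) are illustrative and would need to match whatever our actual .travis.yml uses:

```yaml
# .travis.yml (fragment) -- sketch only.  Jobs matched under
# allow_failures still run and still show red individually, but they
# no longer turn the overall build red.
matrix:
  allow_failures:
    - os: osx
      env: FLAVOR=squeak.sista.spur   # hypothetical env key
  # Report the overall build status as soon as the required jobs
  # finish, without waiting on the allowed-to-fail ones.
  fast_finish: true
```

The allowed-to-fail jobs remain visible in the Travis job list, which partly addresses the concern about such builds escaping people's attention.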

As far as the Sista and Lowcode VMs go, these are experimental, as a threaded-FFI VM would be, something we may have later this year.  It would be nice to segregate the standard VMs that are already in production from these experimental builds, so that failures within the experimental set don't affect the status of the production VMs.  I'd rather see that segregation than mark certain builds as allowed to fail.

I notice Travis has not had a green build for nearly 120 days
since Build #603...

I'd guess that makes it harder to identify which "new" commits introduce "new" defects.
It was tough for me trying to categorise the historical failures to understand what might be required to make them green.  

For example, lately the usual failure has been just a single job...
    macos32x64 squeak.sista.spur
which last worked 22 days ago in Build #713,
but other failures in builds going back to #603 obscure that fact.
Only an exhaustive search through all builds reveals this.

For example, recent Build #748 has macos32x64 squeak.sista.spur as its only failure, 
but then #750, #751, #752, #753 introduce other failures.

Perhaps a contributing factor is commits being pushed directly to the server Cog branch, 
with CI tests only running after they are integrated.  I guess that was done to keep the workflow
similar for those moving from Subversion to Git.  However, it seems to lose some of the value of CI
in "preventing" build failures from entering the repository.  After a year of using Git, maybe 
this workflow could be reconsidered?  Perhaps turn on branch protection (including for administrators)
and "force" all commits to the server Cog branch to first pass CI testing?

That's not the reason.  The reason is so that we can see whether a commit is good across the entire matrix.  If there were no way to discover which elements of the matrix are affected by a commit, that would reduce my productivity.

Another thing that would reduce productivity would be having to commit to a clone of the CI and then only being able to push to the Cog branch once the clone succeeds, as this would introduce hours of delay.  Much better is a system of tests that selects good builds, identifying a good build as one in which the full set of production VMs have passed their tests.  And of course one wants to be able to see the converse.
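The selection rule above is simple enough to sketch: given per-job results, a build is "good" when every production job passed, and experimental jobs are simply ignored.  The job names and results below are made up for illustration, not taken from the actual Travis history:

```python
# Sketch: pick the latest releasable build from per-job CI results.
# A build is releasable iff all PRODUCTION_JOBS passed; experimental
# jobs (e.g. sista) do not count against it.  All data is hypothetical.

PRODUCTION_JOBS = {
    "linux32x86 squeak.cog.spur",
    "macos32x64 squeak.cog.spur",
    "win32x86 squeak.cog.spur",
}

# (build number, job name, passed?)
results = [
    (747, "linux32x86 squeak.cog.spur", True),
    (747, "macos32x64 squeak.cog.spur", True),
    (747, "win32x86 squeak.cog.spur", True),
    (748, "linux32x86 squeak.cog.spur", True),
    (748, "macos32x64 squeak.cog.spur", False),
    (748, "win32x86 squeak.cog.spur", True),
    (748, "macos32x64 squeak.sista.spur", False),  # experimental; ignored
]

def latest_releasable(results):
    """Return the highest build number whose production jobs all passed."""
    by_build = {}
    for build, job, passed in results:
        if job in PRODUCTION_JOBS:
            by_build.setdefault(build, []).append(passed)
    good = [b for b, states in by_build.items()
            if len(states) == len(PRODUCTION_JOBS) and all(states)]
    return max(good) if good else None

print(latest_releasable(results))  # 747 -- #748 fails a production job
```

A post-build hook running something like this could then publish the artefacts of the winning build to a stable "latest releasable VM" download URL.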
Of course needing to submit everything via PRs adds workflow overhead, but some workflows might minimise the impact.
So I can't argue whether the benefit is worth it, since I'm not working in the VM every day. 
I'm just bumping the status quo to table the question for discussion. 

I don't see the benefit of submitting everything via PR when the need is to define the latest releasable VM, which requires additional testing and packaging, not just green builds.


cheers -ben

P.S. Should macos32x64 squeak.sista.spur be fixed, or temporarily removed from the build matrix?
A continually failing build seems to serve no purpose, whereas green builds should be a great help in noticing new failures. 

I hope it can be both fixed and put in an experimental matrix. 

best, Eliot