Hi Ben, Hi All,
I'm quite conservative when it comes to relying on others' infrastructure, so I need some help taking the plunge. Please see below:
On Thu, Dec 17, 2015 at 7:15 AM, Ben Coman btc@openinworld.com wrote:
On Wed, Dec 16, 2015 at 1:00 AM, Ben Coman btc@openinworld.com wrote:
On Wed, Dec 16, 2015 at 10:43 AM, Ryan Macnak rmacnak@gmail.com wrote:
What would be more helpful is if the VM build was fixed to work with a cross compiler, so it would compile fast enough to test ARM and MIPS on Travis CI alongside IA32 and X64.
It would also help if the top-of-tree Intel VMs were always kept working so we'd know which change broke something. Moving the Subversion repository to a more reliable host (which likely means migrating to Git) would also cut down on the false positives Travis reports because the Subversion server has a habit of dropping connections.
+1 github :)
btw, did you know that github has supported subversion clients since 2011 [1]? Here are the supported features [2]. Are these sufficient for your current svn workflows? Potentially we could have ONE repository, where those who like subversion can stick with it and those who like git can use that. Of course, this would need to be proven.
Ah, that's interesting. So my concern is whether github is a safe long-term bet. Specifically, what is there to prevent github from deciding to charge for hosting and holding the data hostage to extract payment, whether after being bought by some third party, after going public and the board taking the decision, or on its own initiative? What safeguards are in place to prevent this? I'm not interested in "this will never happen" arguments. I'm interested in hard data please.
[4] provides pragmatic advice for cutting over. Esteban appears to have done something similar to steps 1 and 2 [3], but it seems his modifications sometimes update this mirror directly, so it's not clear when that branch is an *exact* copy of the current svn trunk. So I'd love to see a github repository that is always an *exact* mirror of the svn repository, with any pharo mods occurring in a branch off that. Even better if the repository for svn users resides on github in place of that mirror.
I've been googling around for problems reported using github via an svn client, and haven't found any smoking guns. Is this something we can trial? I'm willing to put some effort into it. A key requirement would be not interrupting Eliot's work on Spur-64. Potentially we could stay for months on step 3 [4] with the CI infrastructure running on the git side, but code check-ins continuing on the svn side.
btw2, [5] provides a use case for the advantages of a full switch.
cheers -ben
[1] https://github.com/blog/966-improved-subversion-client-support
[2] https://help.github.com/articles/support-for-subversion-clients/
[3] https://github.com/pharo-project/pharo-vm/network
[4] http://blogs.atlassian.com/2013/01/atlassian-svn-to-git-migration-technical-...
[5] http://blogs.atlassian.com/2013/01/svn-to-git-how-atlassian-made-the-switch-...
On 17 Dec 2015, at 18:29, Eliot Miranda eliot.miranda@gmail.com wrote:
Hi Ben, Hi All,
Ah, that's interesting. So my concern is whether github is a safe long-term bet. Specifically what is there to prevent some third party from buying github, or of github going public and the board taking the decision, or github on its own, deciding to charge for hosting, keeping the data hostage to extract payment? What safeguards are in place to prevent this? I'm not interested in "this will never happen" arguments. I'm interested in hard data please.
1.) You will lose bug reports (if you decide to use the bug tracker).
2.) You will lose comments/discussion on pull requests.
3.) You might lose the wiki content.
4.) Unless you or somebody else in this community has the git tree, you lose the history of the project.
1.) You might decide not to use their bug tracker?
2.) You might decide not to use the pull request workflow, or else risk losing some context that lives outside the commit message and the change.
3.) Don't use it then.
4.) One can mitigate by either automatically synchronising the repo to another place or by having the primary somewhere else (which makes #2 harder than it should be).
=> long term. Keep a backup of the repo (and with git you always have that anyway) and if they kick everyone out, push it to another server.
I hope this helps.
holger
PS: For my C level GSM stuff we run our own git infrastructure on git.osmocom.org and github is mirroring some of the repos to github.com/osmocom. This way people can discover our sources more easily, we discover 'forks' but right now we don't use the pull request system at all.
Another possibility is to not use github but still use git on our own server. Obvious trade-offs, control/ease etc. I imagine we could set up a git thingy on squeak.org without too much trouble.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim It is easier to change the specification to fit the program than vice versa.
" Specifically what is there to prevent some third party from buying github, or of github going public and the board taking the decision, or github on its own, deciding to charge for hosting, keeping the data hostage to extract payment? What safeguards are in place to prevent this? I'm not interested in "this will never happen" arguments. I'm interested in hard data please."
On the same note
Specifically, what is there to prevent a new fork or a new language appearing that makes people mostly, if not completely, abandon Pharo and Squeak? What safeguards are there to prevent this? I am not interested in "this will never happen" arguments. I'm interested in hard data please.
Actually my scenario is far more likely than yours, but let's say yours happens. First you need to realise that Github already gives unlimited file size for every github repo, and they have been, without exaggeration, the roof over most open source software out there. But let's say this happens.
Even if that happens, there will be a ton of candidates out there ready to jump on the opportunity to do what github already does. Why? Money and, of course, popularity. And no, most likely you won't lose your github data. First of all, Github already offers an API that allows you to get all sorts of Github data, and not only that, you can do things that git normally does. For example, you can commit, merge pull requests and resolve conflicts, all without even having git installed on your system. But even without taking that into consideration, the new candidates will most likely offer some kind of migration of projects from github to their website.
In any case it's fair to note here that not every company sucks big time as Google does, and not every company is a king of abandonware like Google is. Github has been listening to and respecting its users for a long time. In the end everything dies, or at least gets far less popular; this will happen to github as it happened to Squeak, as it will happen to Pharo, and as it happens to tons of different kinds of technology. Anyone who thinks they are immune is crazy or knows something we don't.
The reason for moving to github, based on hard data? The undeniable fact that github is awesome should be enough. Nothing lasts forever; deal with it.
On Thu, Dec 17, 2015 at 8:33 PM tim Rowledge tim@rowledge.org wrote:
Another possibility is to not use github but still use git on our own server. Obvious trade-offs, control/ease etc. I imagine we could set up a git thingy on squeak.org without too much trouble.
tim
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim It is easier to change the specification to fit the program than vice versa.
On Thu, Dec 17, 2015 at 9:29 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
Ah, that's interesting. So my concern is whether github is a safe long-term bet. Specifically what is there to prevent some third party from buying github, or of github going public and the board taking the decision, or github on its own, deciding to charge for hosting, keeping the data hostage to extract payment? What safeguards are in place to prevent this? I'm not interested in "this will never happen" arguments. I'm interested in hard data please.
This sounds like a risk management problem. We want to minimize the risk that we lose access to the source code and its history, right? Is there other data that you are concerned about?
With regard to GitHub, I think these are the interesting questions:
1. What are the chances that GitHub will stop providing free hosting to open source projects?
2. What are the consequences if #1 occurs?
3. What can we do about it?
First, let's look at #1. This sort of thing does happen. Holding data hostage is unusual, but free online services get shut down all the time. What might cause *Github* to do it?
Could they be forced to cut expenses? Github has been around for almost 8 years, and has stuck with its model of "free public repositories, pay for privacy" throughout that time. It seems to be working for them. Three years ago one of their investors said they've been profitable over most of their life, and are growing revenue at 300% per year[1]. This summer, they raised $250 million more, with the company valued at $2 billion[2]. That indicates that they're still growing quickly, and think they'll be able to expand into new markets. So running out of money and dropping free hosting as a way to cut costs seems unlikely.
How about a change in control? Maybe Oracle will buy them and squeeze as much profit out of them as possible before tossing the dry husk away. For that to happen, the offer would have to be spectacular. Github's investors need at least a 10x return, and probably more, to make money for their funds. If they were worth $2 billion this summer, the acquisition price would have to be something like $20-50 billion. That just doesn't allow the buyer much room to maneuver. There's no special technology behind Github that would make sense to acquire at that price. Github's value is entirely in market position, customer relationships, goodwill etc. To make back the money, the buyer would need to keep running Github and keep earning revenue from it.
Going public? Even less likely. Because of regulatory changes, tech companies have been waiting longer to go public and doing so at a much higher valuation. (Lots of different takes on this, but see eg. [3]) If Github went public, it would be because its valuation was so high that employees and investors wanted to (more easily) sell some shares and enjoy their wealth. That would be a huge endorsement of the business model and current management team. With few investors—only five so far[4]—the founders would undoubtedly retain control, similar to the IPOs of Google and Facebook. Messing with the business model would be unthinkable at that point.
What if Github decided to change strategies without some sort of external impetus? That seems unlikely as well. The economics underlying the freemium strategy are getting more and more compelling over time. Disks are cheap, and the cost of storage keeps going down. I just ran across a new cloud storage service that charges half-a-cent per GB per month[5]. Computing power is also getting cheaper, and with cluster managers like Mesos and Kubernetes, we're using it more efficiently as well. The "burden" of providing free hosting is low and will be getting lower as time goes on.
On the other hand, Github is *the* go-to place for hosting source code. There are millions of users that have both free public repositories and paid private ones. (Github reports 12 million users[6], and I bet a large fraction of them at least have access to both public and private repositories.) Taking away the free repositories would alienate a LOT of customers, and hurt revenue.
So, without saying "this will never happen," I will say that Github shutting down free hosting would be unlikely.
Alright, let's look at #2. If the unlikely did happen, what would be the consequences?
As others have mentioned, the architecture of git makes it impossible to hold the source code and history hostage. Everyone who clones a git repository has a complete copy of the data. If they decided to lock everyone out of the repositories we'd just get another server and do this:
    cd cog
    git remote add origin git://git.squeak.org/cog.git
    git push origin master
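For anyone who wants to try this without touching a real server, here is a rough sketch that uses throwaway local repositories. The names old-host.git, new-host.git and notes.txt are made up for illustration; they stand in for the hosting service that locks us out and for whatever server replaces it. It uses `git remote set-url` rather than `git remote add`, on the assumption that an ordinary clone already has a remote called origin pointing at the old host.

```shell
#!/bin/sh
# Sketch only: local directories stand in for the old and new hosting.
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"

# "old-host.git" plays the hosting service; give it three commits of history.
git init --quiet --bare old-host.git
git clone --quiet old-host.git seed 2>/dev/null
cd seed
for n in 1 2 3; do
    echo "change $n" >> notes.txt
    git add notes.txt
    git -c user.name="Example" -c user.email="example@example.org" \
        commit --quiet -m "Change $n"
done
git push --quiet origin HEAD
cd ..

# Any developer's ordinary clone.
git clone --quiet old-host.git cog

# The old host locks everyone out (simulated by deleting it).
rm -rf old-host.git seed

# Stand up a replacement server and re-home the clone on it.
git init --quiet --bare new-host.git
cd cog
git remote set-url origin "$sandbox/new-host.git"
git push --quiet origin --all
git push --quiet origin --tags

# The full history survived in the clone and is now on the new host.
branch=$(git symbolic-ref --short HEAD)
local_commits=$(git rev-list --count HEAD)
rehomed_commits=$(git -C "$sandbox/new-host.git" rev-list --count "$branch")
echo "commits in clone: $local_commits, commits on new host: $rehomed_commits"
```

Note that `--all` only pushes the branches that exist locally in the clone, so branches nobody had checked out would need to be recovered from some other clone.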
At the same time, we'd be in good company. Github currently has 30 million repositories[6]. Let's be really generous and say that half of those are private, and thus paid-for and exempt from hostage-taking. That means 15 million repositories are now subject to extortion from Github. Sure, most of those are personal forks with no significant changes. But even if there were only, say, 100,000 "real" repositories, that would be a *cataclysm* for the open source world. Alternate hosting would be popping up all over the place, and whatever inconvenience we might have about moving would be quickly solved by larger and richer open source projects. It wouldn't take much more than "here's our new git hosting" posted on the mailing list and squeak.org to make the change, because *everybody* would know about the problem.
Finally, #3, what can we do about it?
Well, in terms of influencing Github's business model, nothing. We have no leverage. So #1 is out of our control.
But, there are a few things we can do to improve #2. First, we could mirror all commits to another repository. That could be a Github competitor, like BitBucket, or just a server that we host with Rackspace or whatever, or even "offline" storage like S3. I believe the Pharo folks are already mirroring the VM source, from the current hosting, so that helps reduce the risk as well.
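As a minimal sketch of what such mirroring could look like, the script below again uses local directories in place of real servers. The names primary.git, mirror-cache.git and backup.git are hypothetical; in practice the backup remote would be a URL at whatever second host we picked, and the last two git commands would run from cron or a CI job. This is one possible arrangement, not a description of what the Pharo folks actually do.

```shell
#!/bin/sh
# Sketch only: local bare repositories stand in for the primary and backup hosts.
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"

# "primary.git" plays the main hosted repository; publish one commit to it.
git init --quiet --bare primary.git
git clone --quiet primary.git work 2>/dev/null
cd work
echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.org" \
    commit --quiet -m "Initial commit"
git push --quiet origin HEAD
cd ..

# One-time setup: a mirror clone of the primary, plus a second remote for the backup.
git clone --quiet --mirror primary.git mirror-cache.git
git init --quiet --bare backup.git
git -C mirror-cache.git remote add backup "$sandbox/backup.git"

# Periodic job: pull every ref from the primary, then push every ref to the backup.
git -C mirror-cache.git fetch --quiet --prune origin
git -C mirror-cache.git push --quiet --mirror backup

# Compare the full ref lists of primary and backup.
primary_refs=$(git -C primary.git for-each-ref --format='%(objectname) %(refname)')
backup_refs=$(git -C backup.git for-each-ref --format='%(objectname) %(refname)')
echo "primary: $primary_refs"
echo "backup:  $backup_refs"
```

Because `--mirror` also deletes refs on the backup that have disappeared from the primary, a backup meant to survive malicious deletion would want a different push policy.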
Second, we could move more of the VM source into Smalltalk. That might mean generating more of the source files with VMMaker, running builds from within the image instead of using CMake etc. It probably wouldn't be worth it to make *all* the platform sources versioned in MC, but we could go further in that direction from where we are now.
Finally, if it really did come down to Github holding the sources hostage and we had no other copies, we could just pay up. Currently, their cheapest plan is $7/month for 5 private repositories, which ought to cover our needs. Even with the meager donations that Squeak attracts today, surely we could raise $85 to get a year of paid hosting, and use that time to figure out what to do for the long term. Github might raise their prices (Why not? This scenario already has them being suicidally irrational.), but I can't see them exceeding our fundraising capabilities. What's the point of extortion if the victim can't pay?
(As a side note, I would be shocked if hosting squeakvm.org currently costs less than $7/month. No idea who's paying for it, but how confident are we that they'll continue to do so?)
In summary, Github is a very safe bet. Your nightmare scenario involves a series of very improbable events: Github would have to stop offering free hosting. They'd have to actively alienate their paying customers by holding their source code hostage. There would have to be sudden disk failures on dozens of laptops and servers where the repository is cloned. And to top it all off, the larger Squeak community, including Pharo, Cuis, Newspeak, Scratch and Croquet would have to be unable to come up with a few dozen dollars to pay for the hosting.
This will never happen.
Colin
[1] http://peter.a16z.com/2012/07/09/software-eats-software-development/
[2] http://fortune.com/2015/07/29/github-raises-250-million-in-new-funding-now-v...
[3] http://www.forbes.com/sites/samanthasharf/2014/12/24/is-the-ipo-outmoded-why...
[4] https://www.crunchbase.com/organization/github/investors
[5] https://www.backblaze.com/b2/cloud-storage.html
[6] https://github.com/about/
vm-dev@lists.squeakfoundation.org