Mike Ashley


January 02, 2016

I’ve been contributing to the Sovereign project, which is hosted on GitHub. I was an early user of GitHub but deleted my account in 2014 as part of rejecting Web 2.0 social networking. I had to rejoin to contribute fixes to Sovereign. I still don’t like GitHub, but I’m learning how to give up as little content as possible to it.

First of all, I love git. It’s one of the most profound collaboration tools ever invented. It’s made an enormous, positive impact on the way I work for myself and with others. Building is an evolutionary process with false starts and eventual victories. Git helps you communicate this process in logical steps that make it easier for the audience to understand how you got from point A to point B. Finally, since it is distributed, there is no central control of content. Everybody has a full copy.

GitHub’s design discourages habits that make git powerful for development. Pull request and issue tracking workflows are the worst offenders.

Pull request workflow

GitHub’s pull request workflow invites users to document changes on GitHub instead of with commit logs and documentation in the repository. Later, if a developer not using GitHub looks at a series of changes, it can be difficult to understand why changes were made and what tradeoffs were considered.

This especially hurts open source software projects. Maintenance is neither a glamorous nor an easy job, and poor documentation of why changes were made doesn’t help developers. Open source projects also often suffer from design incoherence, and failing to document in the repository why features are designed the way they are doesn’t help future developers either.

This is the easiest problem to get right. Developers need to document their changes with good commit messages and write basic design documents that are kept with the repository. GitHub hurts this by providing another means for documenting changes. Maintainers must expect and enforce discipline if they want their project history to be robust.
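One way a maintainer might back up that expectation is with a small check on commit messages. The following is a minimal sketch, not a recommendation from any particular project: the function name check_commit_message and the ten-word minimum for the body are assumptions I picked for illustration.

```python
def check_commit_message(message: str, min_body_words: int = 10) -> list[str]:
    """Return a list of problems found in a commit message (empty if none).

    The name and the min_body_words threshold are illustrative assumptions.
    """
    # Ignore the "#" comment lines git puts in the message template.
    lines = [ln.rstrip() for ln in message.splitlines() if not ln.startswith("#")]

    # Trim blank lines from both ends.
    while lines and not lines[0]:
        lines.pop(0)
    while lines and not lines[-1]:
        lines.pop()

    if not lines:
        return ["empty commit message"]

    problems = []

    # Convention: a one-line summary, a blank line, then the body.
    if len(lines) > 1 and lines[1]:
        problems.append("summary should be followed by a blank line")

    # The body is where the "why" and the tradeoffs belong.
    body_words = sum(len(ln.split()) for ln in lines[1:])
    if body_words < min_body_words:
        problems.append("body should explain why the change was made")

    return problems


if __name__ == "__main__":
    for problem in check_commit_message("Fix typo"):
        print(problem)
```

A check like this could be called from a commit-msg hook, which git runs with the path to the file holding the proposed message. It can only test for the presence of an explanation, not its quality, so it supports a maintainer’s review rather than replacing it.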

Issue tracking

Issues need discussion in order to disposition them, and these discussions are valuable. They uncover design misses, hidden design intent, subtle implementation errors, and more. It’s frustrating that issues are kept in separate databases. GitHub is not a unique offender. Bugzilla, Trac, Jira…they all have this problem.

I am not advocating for issues to be tracked in git, although it’s interesting to think about possible designs. Issue tracking systems have too many requirements for me to think that keeping the issues in the git repository makes sense.

The only reasonable way I can see to handle this is that the developer addressing an issue must take responsibility for consolidating the discussion around an issue and making sure the essence is captured in commit logs and design documentation when the issue is addressed. Again, GitHub hurts this by making it all too easy for a developer to reference an issue number in a commit message summary and move on. Maintainers again have to expect and enforce discipline if they want good history on their projects.
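To make the “reference an issue number and move on” habit concrete, here is a rough sketch of how a maintainer might flag such commits during review. The function name leans_on_issue_reference and the eight-word threshold are assumptions for illustration, and the keyword list only approximates the closing keywords GitHub recognizes.

```python
import re

# An issue reference such as "#42", optionally preceded by a closing keyword.
ISSUE_REF = re.compile(
    r"(?:\b(?:fix(?:es|ed)?|close[sd]?|resolve[sd]?)\s+)?#\d+",
    re.IGNORECASE,
)


def leans_on_issue_reference(message: str, min_words: int = 8) -> bool:
    """Return True if the message cites an issue but says little else.

    The name and the min_words threshold are illustrative assumptions.
    """
    if not ISSUE_REF.search(message):
        return False  # no issue reference, nothing to flag here

    # Remove the references and count the words that remain.
    remaining = ISSUE_REF.sub("", message)
    words = re.findall(r"[A-Za-z']+", remaining)
    return len(words) < min_words


if __name__ == "__main__":
    print(leans_on_issue_reference("Fixes #42"))
```

A flagged commit is not necessarily bad. It is a prompt to ask whether the essence of the issue discussion made it into the commit log or the design documentation.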

The iPhoto comparison

Fundamentally I am asking that metadata be kept with content. Eventually, metadata separate from the content will get lost, and companies use metadata as a way to lock in customers.

As an example from consumer space, iPhoto (at least older versions of it) would keep metadata separate from the pictures. If you tagged a picture or fixed a date, the data was not pushed into EXIF tags in the picture but instead stored in a separate database. Of course, if you wanted to leave iPhoto, you were faced with the problem of losing the metadata you worked hard to create. I don’t know why Apple’s engineers chose the design they did, but it made it hard to stop using iPhoto.

Project maintainers who host on GitHub and keep their metadata on GitHub instead of in the git repository are locking themselves in. Maintainers should set an expectation to avoid this, but everyone contributing to projects can take it on themselves not to reinforce it.