How to Write a Good GitHub README Using Data Science

July 5, 2016 · opensource, commentary · 1 min read

Besir Kurtulmus has an interesting post over on the Algorithmia blog about what makes a good GitHub README:

We set out to flex our data science muscles, and see if we could come up with an objective standard for what makes a good GitHub README using machine learning. The result is the GitHub README Analyzer demo, an experimental tool to algorithmically improve the quality of your GitHub README’s.

Some of our assumptions proved to be true, while some were off. We found that our assumption about headers and the text from paragraphs correlated with popular repositories. However, this wasn’t true for the length of a repository, or the count of code samples and images.

I’m not sure how practical their findings are, but I love the idea of looking at existing patterns to write better software.
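To make the idea concrete, here is a minimal sketch of how the surface features mentioned in the post (headers, paragraph text, code samples, images, and overall length) might be counted for a Markdown README. This is not Algorithmia's analyzer, and it says nothing about how they modeled popularity; the function name `readme_features` and the regular expressions are assumptions made for illustration only.

```python
import re

# Hypothetical helper, not part of the GitHub README Analyzer.
IMAGE_RE = re.compile(r"!\[[^\]]*\]\([^)]*\)")   # Markdown image: ![alt](url)
HEADER_RE = re.compile(r"#{1,6}\s")              # ATX header: "# Title"


def readme_features(markdown: str) -> dict:
    """Count a few surface features of a Markdown README."""
    headers = code_blocks = images = prose_words = 0
    in_fence = False

    for line in markdown.splitlines():
        stripped = line.strip()

        # Toggle on fenced code blocks; count each opening fence once.
        if stripped.startswith("```"):
            if not in_fence:
                code_blocks += 1
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # ignore the contents of code samples

        if HEADER_RE.match(stripped):
            headers += 1
            continue

        # Count images, then count the remaining words as paragraph text.
        images += len(IMAGE_RE.findall(stripped))
        prose_words += len(IMAGE_RE.sub("", stripped).split())

    return {
        "headers": headers,
        "code_blocks": code_blocks,
        "images": images,
        "prose_words": prose_words,
        "length": len(markdown),
    }
```

Counts like these could be compared against something like star counts across many repositories to see which ones track popularity, which is roughly the kind of question the post asks.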



I am Brandon Keepers, and I work at GitHub on making Open Source more approachable, effective, and ubiquitous. I tend to think like an engineer, work like an artist, dream like an astronaut, love like a human, and sleep like a baby.