opensoul.org

Git: the NoSQL database

September 1, 2011 · popular, code, talk · 4 min read

We all know that Git is pretty amazing. It’s fast, reliable, flexible, and it keeps our project history safely nuzzled in its cozy object database while we sleep soundly at night. But I’m curious to see if it can be used for more than code. I’ve had a few apps in the back of my mind for a while now that would be really interesting if the data were stored in Git.

If only there was an easy way to read and write a Git repo from Ruby…

Toystore & adapter-git

Toystore is an ActiveModel-based object mapper for key-value data stores. The beauty of Toystore is that it doesn’t care what the backend is. It uses Adapter to abstract the connection to any data store that can set, get, and delete keys.

Well, Git is a key-value store; it supports set, get and delete on keys (a.k.a. paths). So I sat down with Scott Chacon’s Git Internals Peepcode PDF and put together adapter-git, built on top of Grit.
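To make the key-value idea concrete, here is a minimal in-memory sketch of set, get, and delete on paths. It is not adapter-git's or Grit's code: the `TinyGitStore` class and the `items/example` key are hypothetical names made up for illustration, and nothing is written to a real repo. The one piece taken from Git itself is how a blob is named: the SHA-1 of a `blob <size>\0` header followed by the content.

```ruby
require 'digest'

# Hypothetical in-memory model of Git as a key-value store.
# An illustration only, not how adapter-git or Grit is implemented.
class TinyGitStore
  def initialize
    @objects = {} # SHA-1 => content (stand-in for the object database)
    @tree    = {} # path  => SHA-1   (stand-in for the current tree)
  end

  # Git names a blob by the SHA-1 of "blob <bytesize>\0<content>".
  def self.blob_sha(content)
    Digest::SHA1.hexdigest("blob #{content.bytesize}\0#{content}")
  end

  # set: store the content as a blob and point the path at it.
  def set(path, content)
    sha = self.class.blob_sha(content)
    @objects[sha] = content
    @tree[path] = sha
    sha
  end

  # get: resolve the path to a SHA, then the SHA to content.
  def get(path)
    sha = @tree[path]
    sha && @objects[sha]
  end

  # delete: drop the path from the tree; returns true if it existed.
  # The blob stays in the object store, much as Git keeps old objects.
  def delete(path)
    !@tree.delete(path).nil?
  end
end

store = TinyGitStore.new
store.set('items/example', 'Git: the NoSQL database')
puts store.get('items/example')
store.delete('items/example')
```

In a real repo each set or delete would also produce a new tree and a commit, which is where the history comes from.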

Now I can create pretty models that are stored in Git.

class Item
  include Toy::Store

  store :git, Gaskit.repo,
    :branch => 'content',
    :path   => 'items'

  attribute :title,       String
  attribute :description, String
  attribute :created_at,  Time, :default => lambda { Time.now }
end

Toystore uses conventions that will be familiar to anyone who has used Active Record or MongoMapper.

item = Item.create!(:title => 'Git: the NoSQL database')
item.update_attributes(:description => "OMG this is awesome!")

The biggest difference is that you can’t “find” records. The data stored in a key-value store is opaque, so all you can do is get it by key.

item = Item.get!('3FB053FA-0A3B-4903-9CE0-2A8A964E0F37')
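Here is a rough sketch of why "find" is missing, using a plain Hash as a stand-in for any key-value backend. The helpers `put`, `get!`, and `scan_by_title` are hypothetical and are not part of Toystore or Adapter, and the JSON encoding is an assumption made for the example. Getting by key is a single lookup, while anything resembling a query has to read and decode every value.

```ruby
require 'json'
require 'securerandom'

# Hypothetical helpers over a plain Hash that stands in for a
# key-value backend. These are not Toystore or Adapter APIs.

# Store the attributes under a freshly generated key and return the key.
def put(store, attrs)
  key = SecureRandom.uuid.upcase
  store[key] = JSON.generate(attrs)
  key
end

# Fetch by key: one lookup. Raises KeyError if the key is missing.
def get!(store, key)
  JSON.parse(store.fetch(key))
end

# "Finding" by attribute: the store can't look inside its values,
# so every value has to be read and decoded.
def scan_by_title(store, title)
  store.each_value
       .map { |raw| JSON.parse(raw) }
       .select { |attrs| attrs['title'] == title }
end

store = {}
key = put(store, :title => 'Git: the NoSQL database')
put(store, :title => 'Something else')

puts get!(store, key)['title']
puts scan_by_title(store, 'Git: the NoSQL database').size
```

If an app really needs lookups by attribute, the usual key-value answer is to maintain a separate index under its own keys.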

Caveats

I have no idea if Git will work as a data backend for an application. I’m sure GitHub has solved many of the problems with concurrency and scaling a filesystem.

There is still a lot of room for improvement on adapter-git too. Here are just a few things I’d like to add soon:

  • Locking - I wouldn’t want to use adapter-git for anything with a lot of concurrency at the moment. Git commits are atomic; you’ll never corrupt a repo by a failed commit, but if you have a lot of concurrent access, you might lose commits.
  • Custom commit messages - Currently adapter-git just uses the ID of the key being set in the commit message. In the app I’m experimenting with, I’ve already had a desire to set custom commit messages.
  • Update working copy - adapter-git currently just works against the git repo itself. It doesn’t update your working copy. So `git status` will currently tell you that you’ve deleted files from your working copy after you update records with adapter-git.
  • Merge conflicts - I’m looking forward to being able to programmatically resolve merge conflicts. Riak has a cool pattern for resolving conflicts on read, so I’d love to see if I can build something into adapter-git and toystore to work in a similar fashion.
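The lost-commit problem from the locking item can be sketched without touching a repo. In this in-memory model, two writers read the same branch head and both try to commit on top of it. A compare-and-swap on the ref lets the first one win and tells the second one to re-read and retry instead of silently overwriting. `BranchRef` and the commit names are hypothetical, and this is one possible approach rather than anything adapter-git does today. Git itself offers a similar check: `git update-ref <ref> <newvalue> <oldvalue>` only moves the ref if it still holds the old value.

```ruby
# Hypothetical in-memory branch ref with compare-and-swap.
# A sketch of optimistic locking, not adapter-git's implementation.
class BranchRef
  attr_reader :head

  def initialize(head = nil)
    @head = head
    @lock = Mutex.new
  end

  # Move the ref to new_head only if it still points at expected_head.
  # Returns true on success, false if another writer got there first.
  def compare_and_swap(expected_head, new_head)
    @lock.synchronize do
      return false unless @head == expected_head
      @head = new_head
      true
    end
  end
end

ref = BranchRef.new('c0')

# Both writers read the same head before either one commits.
seen_by_a = ref.head
seen_by_b = ref.head

a_ok = ref.compare_and_swap(seen_by_a, 'c1-from-a') # first writer wins
b_ok = ref.compare_and_swap(seen_by_b, 'c1-from-b') # stale head, rejected

puts "a committed: #{a_ok}, b committed: #{b_ok}, head: #{ref.head}"
```

The rejected writer would then rebuild its commit on top of the new head and try again, which is also the natural place to hook in conflict resolution.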

Check out adapter-git on GitHub and try building an app backed by Git!

Update: Check out the video and slides for a talk I gave on this topic.


@bkeepers

I am Brandon Keepers, and I work at GitHub on making Open Source more approachable, effective, and ubiquitous. I tend to think like an engineer, work like an artist, dream like an astronaut, love like a human, and sleep like a baby.