opensoul.org

ETags with memcached

I love ETags, but there’s something that annoys me: most implementations revolve around pulling a record out of a data store and only “rendering” the response if it hasn’t been modified.

For example, here is the standard way to implement ETags in Rails:

def show
  @article = Article.find(params[:id])

  if stale?(:last_modified => @article.published_at.utc, :etag => @article)
    render :json => @article
  end
end

The problem with this approach is that the request has already gone through most of your application stack (parsing params, authentication, authorization, a few database lookups), so ETags only save you render time and some bandwidth.

While working on a Sinatra-based JSON web service that gets very heavy traffic, I wanted to find a way to short-circuit requests and avoid most of the stack if a resource hasn’t been modified.

Creating the ETag

Since the application is a fairly RESTful web service, it was easy to decide that ETags should be tied to models and invalidated when a record is updated.

I included this module into all of the models:

module Etag
  def self.included(base)
    base.after_update :etag!
  end

  # Fetch the current etag
  def etag
    value = $memcache[etag_key] || etag!
    "#{etag_key}:#{value}"
  end

  # Change the etag
  def etag!
    $memcache[etag_key] = ActiveSupport::SecureRandom.base64(16)
  end

  def etag_key
    Digest::SHA1.hexdigest("#{self.class.name}-#{id}")
  end
end

This gives us an #etag method, which stores a random string in a memcached key that is a digest of the class name and id, and returns the key and value joined by a colon. Until the record changes, every call to #etag returns the same key and the same value from memcached.

After the model is updated, the #etag! method is called, which stores a new random string under the same memcached key.
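To make the lifecycle concrete, here is a minimal sketch that runs without Rails or memcached. It makes several assumptions: `STORE` is a plain Hash standing in for `$memcache`, `Game` is a hypothetical plain Ruby class, `update!` stands in for the `after_update` callback, and Ruby's standard `SecureRandom` replaces `ActiveSupport::SecureRandom`.

```ruby
require 'digest/sha1'
require 'securerandom'

# Assumption: a plain Hash stands in for the memcached client ($memcache).
STORE = {}

# Hypothetical model; the three etag methods mirror the module above.
class Game
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Fetch the current etag, creating one on first use
  def etag
    value = STORE[etag_key] || etag!
    "#{etag_key}:#{value}"
  end

  # Change the etag by storing a fresh random value
  def etag!
    STORE[etag_key] = SecureRandom.base64(16)
  end

  def etag_key
    Digest::SHA1.hexdigest("#{self.class.name}-#{id}")
  end

  # Assumption: stands in for the ActiveRecord after_update callback.
  def update!
    etag!
  end
end

game = Game.new(1)
first = game.etag
puts first == game.etag   # true: the value is stable until the record changes
game.update!
puts first == game.etag   # false: updating stored a new random value
```

The key half of the ETag never changes for a given record; only the random value half is replaced, which is what lets a later check compare against memcached without loading the record.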

Responding with the ETag

Now in the Sinatra application I just return the ETag for the model that is being rendered.

get '/games/:id' do
  game = current_user.games.find(params[:id])
  etag game.etag
  game.to_json
end

Checking the ETag

Subsequent requests will contain the ETag in the If-None-Match header, so now it just needs to be verified by a Rack middleware:

class EtagMiddleware < Rack::Auth::AbstractHandler
  class Request < Rack::Auth::AbstractRequest
    def cachable?
      etag.present?
    end

    def etag
      @etag ||= @env['HTTP_IF_NONE_MATCH'] && @env['HTTP_IF_NONE_MATCH'].gsub(/^"(.*)"$/, '\1')
    end

    def modified?
      key, value = etag.to_s.split(':', 2)
      # Modified when the stored value no longer matches what the client sent
      value.nil? || $memcache[key] != value
    end
  end

  def call(env)
    request = Request.new(env)

    if request.cachable? && !request.modified?
      # Echo the client's header as-is so the ETag keeps its surrounding quotes
      [304, {'ETag' => env['HTTP_IF_NONE_MATCH'], 'Cache-Control' => 'private'}, []]
    else
      @app.call(env)
    end
  end
end

I’m being tricky with the ETag and setting it to a key/value pair. So the middleware just has to look up the key in memcached and see if the value matches. If it does, we know the resource hasn’t changed and can render a 304. If the value doesn’t match, then the resource has been updated and we process the request as normal.
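The check described above can be sketched without Rack or memcached. This is only an illustration under stated assumptions: `ETAGS` is a Hash standing in for memcached, `EtagCheck` is a hypothetical simplified class (not the middleware above), and the lambda stands in for the Sinatra application.

```ruby
# Assumption: an in-memory Hash stands in for memcached, seeded with one
# current key/value pair.
ETAGS = { 'abc123' => 'v1' }

# Hypothetical, simplified version of the middleware idea.
class EtagCheck
  def initialize(app, store)
    @app = app
    @store = store
  end

  def call(env)
    header = env['HTTP_IF_NONE_MATCH']
    return @app.call(env) if header.nil? || header.empty?

    # Strip the surrounding quotes, then split "key:value" on the first colon.
    key, value = header.gsub(/\A"(.*)"\z/, '\1').split(':', 2)

    if value && @store[key] == value
      # The stored value still matches: answer without touching the app.
      [304, { 'ETag' => header, 'Cache-Control' => 'private' }, []]
    else
      # Missing, malformed, or outdated ETag: process the request as normal.
      @app.call(env)
    end
  end
end

# Stand-in for the real application.
app = lambda { |env| [200, { 'Content-Type' => 'application/json' }, ['{"id":1}']] }
stack = EtagCheck.new(app, ETAGS)

fresh   = stack.call({ 'HTTP_IF_NONE_MATCH' => '"abc123:v1"' })
stale   = stack.call({ 'HTTP_IF_NONE_MATCH' => '"abc123:old"' })
missing = stack.call({})

puts fresh.first    # 304, served from the key/value lookup alone
puts stale.first    # 200, the value no longer matches
puts missing.first  # 200, no If-None-Match header was sent
```

Because the key is carried inside the ETag itself, the lookup needs nothing from the request beyond one header, which is why it can run ahead of the rest of the stack.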

This approach worked really well for us. It is easy to implement on new resources, and requests for unmodified resources only take a few milliseconds.

http, performance, ruby, and sinatra January 29, 2011

3 Comments

  1. jason January 30, 2011

    Thank you for this!

    Quick question. Besides the obvious weight differences of the frameworks, shouldn't Rails be (almost) equally effective at passing on an ETag without bothering its full stack?

    Also, do you have any opinion or concern about using Redis as an alternative to memcached in this instance?

  2. Brandon Keepers January 31, 2011

    Jason: Yeah, this should work just as well with Rails. You can still use the middleware and module, and do this in your action:

    def show
      game = current_user.games.find(params[:id])
      fresh_when :etag => game.etag
      render :json => game
    end

    You shouldn’t have a problem using Redis, but I know we’ve had a much easier time scaling memcached than Redis.  Redis seems to choke when we give it too much data.

  3. Jon Wood February 17, 2011

    Thanks! I was trying to hunt it down the other day, and then stumbled upon it again when I found your post on Cucumber and Sunspot – I’m going to try implementing it in an application hosted on Heroku this evening to reduce load.

My name is Brandon Keepers. I like to build things, usually in Ruby or JavaScript. I work at GitHub and live in Holland, MI.