opensoul.org

ETags with memcached

January 29, 2011

I love ETags, but there’s something that annoys me: most implementations revolve around pulling a record out of a data store and only skipping the “rendering” of the response if the record hasn’t been modified.

For example, here is the standard way to implement ETags in Rails:

def show
  @article = Article.find(params[:id])

  if stale?(:last_modified => @article.published_at.utc, :etag => @article)
    render :json => @article
  end
end

The problem with this approach is that the request has already gone through most of your application stack (parsing params, authentication, authorization, a few database lookups), so ETags only save you render time and some bandwidth.

While working on a Sinatra-based JSON web service that gets very heavy traffic, I wanted to find a way to short-circuit requests and avoid most of the stack if a resource hasn’t been modified.

Creating the ETag

Since the application is a fairly RESTful web service, it was easy to make the decision that ETags should be tied to models, and invalidated when a record is updated.

I included this module into all of the models:

require 'digest/sha1'
require 'securerandom'

module Etag
  def self.included(base)
    # Rotate the etag whenever the record is updated
    base.after_update :etag!
  end

  # Fetch the current etag
  def etag
    value = $memcache[etag_key] || etag!
    "#{etag_key}:#{value}"
  end

  # Change the etag
  def etag!
    $memcache[etag_key] = SecureRandom.base64(16)
  end

  def etag_key
    Digest::SHA1.hexdigest("#{self.class.name}-#{id}")
  end
end

This gives us an #etag method, which stores a random string under a memcached key that is a digest of the class name and id, and returns the key and value joined by a colon. Until the record changes, every call to #etag returns the same key and value.

After the model is updated, the #etag! method is called, which stores a new random string under the same memcached key.
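To make the rotation concrete, here is a minimal sketch of the same module running outside Rails. It assumes a plain Hash standing in for the memcached client and a hypothetical Struct-based Game model, and it calls #etag! by hand where the after_update callback would normally fire.

```ruby
require 'digest/sha1'
require 'securerandom'

# Assumption: a plain Hash stands in for the memcached client ([] and []= only).
$memcache = {}

module Etag
  # Fetch the current etag, creating one on first use
  def etag
    value = $memcache[etag_key] || etag!
    "#{etag_key}:#{value}"
  end

  # Store a new random value, invalidating any etag handed out earlier
  def etag!
    $memcache[etag_key] = SecureRandom.base64(16)
  end

  def etag_key
    Digest::SHA1.hexdigest("#{self.class.name}-#{id}")
  end
end

# Hypothetical model; the real ones are ActiveRecord classes.
Game = Struct.new(:id) do
  include Etag
end

game = Game.new(42)
first = game.etag
same = game.etag      # repeated calls reuse the stored value
game.etag!            # what the after_update callback would do
rotated = game.etag

puts first == same    # true
puts first == rotated # false
```

The key half of the ETag never changes for a given record; only the random value after the colon does.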

Responding with the ETag

Now in the Sinatra application I just return the ETag for the model that is being rendered.

get '/games/:id' do
  game = current_user.games.find(params[:id])
  etag game.etag
  game.to_json
end

Checking the ETag

Subsequent requests will contain the ETag in the If-None-Match header, so now it just needs to be verified by a Rack middleware:

class EtagMiddleware < Rack::Auth::AbstractHandler
  class Request < Rack::Auth::AbstractRequest
    def cachable?
      etag.present?
    end

    def etag
      @etag ||= @env['HTTP_IF_NONE_MATCH'] && @env['HTTP_IF_NONE_MATCH'].gsub(/^"(.*)"$/, '\1')
    end

    def modified?
      key, value = etag.to_s.split(':', 2)
      # Modified when the value is missing or no longer matches what is stored
      value.nil? || $memcache[key] != value
    end
  end

  def call(env)
    request = Request.new(env)

    if request.cachable? && !request.modified?
      [304, {'ETag' => %("#{request.etag}"), 'Cache-Control' => 'private'}, []]
    else
      @app.call(env)
    end
  end
end

I’m being tricky with the ETag and setting it to a key/value pair. So the middleware just has to look up the key in memcached and see if the value matches. If it does, we know the resource hasn’t changed and can render a 304. If the value doesn’t match, then the resource has been updated and we process the request as normal.
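The short-circuit can be sketched without Rack or memcached. This is an illustration under stated assumptions, not the middleware above: EtagShortCircuit is a hypothetical class, a Hash stands in for memcached, and a lambda plays the downstream application so we can count how often it is reached.

```ruby
# Hypothetical stand-in for the middleware; the store is any object with [].
class EtagShortCircuit
  def initialize(app, store)
    @app = app
    @store = store
  end

  def call(env)
    header = env['HTTP_IF_NONE_MATCH']
    etag = header && header.gsub(/^"(.*)"$/, '\1') # strip surrounding quotes
    key, value = etag.to_s.split(':', 2)

    if value && @store[key] == value
      # Stored value matches: the resource is unchanged, so skip the app
      [304, { 'ETag' => %("#{etag}"), 'Cache-Control' => 'private' }, []]
    else
      @app.call(env)
    end
  end
end

store = { 'abc123' => 'v1' } # assumption: Hash in place of memcached
calls = 0
app = lambda do |_env|
  calls += 1
  [200, { 'Content-Type' => 'application/json' }, ['{"id":1}']]
end

stack = EtagShortCircuit.new(app, store)

fresh = stack.call({ 'HTTP_IF_NONE_MATCH' => '"abc123:v1"' })
stale = stack.call({ 'HTTP_IF_NONE_MATCH' => '"abc123:old"' })
first_visit = stack.call({})

puts fresh[0]       # 304
puts stale[0]       # 200
puts first_visit[0] # 200
puts calls          # 2
```

Only the stale request and the request without an ETag reach the application; the matching one is answered from the store lookup alone.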

This approach worked really well for us. It is easy to implement on new resources, and requests for unmodified resources only take a few milliseconds.


@bkeepers

I am Brandon Keepers, and I work at GitHub on making Open Source more approachable, effective, and ubiquitous. I tend to think like an engineer, work like an artist, dream like an astronaut, love like a human, and sleep like a baby.