June 4th, 2010

Rack Middleware Use Case Examples

At the Prague Ruby meetup on Wednesday, I gave a talk about Rack and middleware. In trying to deliver it as fast as possible, I kinda forgot the main point of the talk: explaining why it matters. So let’s try to remedy that a bit here, at the very least.

Rack

Let’s briefly state the obvious so we get a starting point. So, what is Rack? Rack is this:

require 'rubygems'
require 'rack'

# A valid Rack application is an object responding to the `call` method, which takes
# a Hash (the request environment) as its argument and returns an array of
# `[status, headers, body]`.

app = proc { |env|
  [
  200,                                          # Status
  {'Content-Type' => 'text/plain'},             # Headers
  ["Hello World!"]                              # Body
  ]
}

Rack::Handler::Thin.run app

Is Rack just some piece of Ruby code, then? Sure, but that’s not the point. First of all, Rack is an interface between Ruby code and a web server, abstracting away stuff like parsing the request and sending the response. Rack is a set of conventions and a spec, which any Ruby web framework can (should?) conform to, and get the low-level stuff for free.

Rack Middleware

Then there’s the middleware concept. Rack applications are usually built as a stack of small applications, called middleware, which pass requests and responses through their „chain“. You should have a look at the „Rack & Metal“ presentation from Gregg Pollack to get some mental picture. Right now, looking at this one will do:

(Figure: Rack middleware stack)

You can see how the initial request, and then the output of every item in the „stack“ or „chain“, is passed along. Middleware can manipulate the request or the response, it can halt the processing altogether, or it can do completely unrelated stuff, like logging. Lots of useful middleware is bundled with Rack itself (CommonLogger, Auth::Basic, …), more is available as part of the Rack contrib repository, and much more still at the http://coderack.org website.
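To make the „chain“ a bit more tangible, here is a minimal sketch in plain Ruby, with no Rack helpers involved. The class names `Upcase` and `Gatekeeper` are made up for this illustration; one of them changes the response, the other halts the processing before the request ever reaches the application.

```ruby
# The innermost application: the last link of the chain.
app = proc { |env| [200, {'Content-Type' => 'text/plain'}, ["Hello World!"]] }

# Hypothetical middleware which changes the response on its way back up.
class Upcase
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    upcased = []
    body.each { |part| upcased << part.upcase }
    [status, headers, upcased]
  end
end

# Hypothetical middleware which halts processing for paths under /admin.
class Gatekeeper
  def initialize(app)
    @app = app
  end

  def call(env)
    if env['PATH_INFO'].to_s.start_with?('/admin')
      [403, {'Content-Type' => 'text/plain'}, ["Forbidden"]]
    else
      @app.call(env)
    end
  end
end

# Build the stack by hand: Gatekeeper wraps Upcase, which wraps the app.
stack = Gatekeeper.new(Upcase.new(app))

status, _headers, body = stack.call('PATH_INFO' => '/')
puts "#{status} #{body.join}"   # prints "200 HELLO WORLD!"

status, _headers, body = stack.call('PATH_INFO' => '/admin')
puts "#{status} #{body.join}"   # prints "403 Forbidden"
```

In a real application you would not nest the constructors by hand; the `use` and `run` methods of Rack::Builder (the DSL of a config.ru file) do the same wrapping for you.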

I may have over-emphasized the win factor of using available middlewares like Rack::Cache or Rack::Throttle in the talk, and it worries me. Of course it’s nice not having to do some hard work. But that’s certainly not the point. Writing your own middleware could be very, very useful as well and I should have said why exactly.

Writing your own middleware

But first — how difficult is writing your own middleware? Easy:

In this example the middleware inserts the information about how much time the response took into the response body.

As you can see, it’s just a normal Ruby class, which gets passed the application which uses it (@app), in its constructor, and which has a call method. In our case, it first gets the status, headers and body information from the underlying application (or middleware in front of it). Second, it inserts a little piece of HTML into the body, recalculates the body length and passes everything along. It also prints some debug output on STDOUT.
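The description above can be sketched roughly like this. Treat it as an approximation written for this text, not as the exact listing: the class name `ResponseTimer` and the wording of the inserted HTML are assumptions.

```ruby
# Hypothetical name; the middleware only needs `initialize(app)` and `call(env)`.
class ResponseTimer
  def initialize(app)
    @app = app
  end

  def call(env)
    # Measure the time spent in the underlying application (or middleware).
    started = Time.now
    status, headers, body = @app.call(env)
    elapsed = Time.now - started

    # A Rack body only has to respond to `each`, so collect it piece by piece.
    parts = []
    body.each { |part| parts << part }
    body.close if body.respond_to?(:close)

    # Insert a little piece of HTML at the end of the body.
    content = parts.join + format("<p>Response time: %.4f s</p>", elapsed)

    # The body has changed, so its length has to be recalculated.
    headers = headers.merge('Content-Length' => content.bytesize.to_s)

    # Debug output on STDOUT.
    $stdout.puts "[ResponseTimer] #{env['PATH_INFO']} took #{elapsed} s"

    [status, headers, [content]]
  end
end

app = proc { |env| [200, {'Content-Type' => 'text/html'}, ["<h1>Hello World!</h1>"]] }
status, headers, body = ResponseTimer.new(app).call('PATH_INFO' => '/')
puts body.first
```

Note that appending to the end of the body is a shortcut; it produces sloppy HTML, but it keeps the example short.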

You can download the middleware-all-in-one-example.rb file to see it in action, along with another example.

Use cases

One type of use case for Rack middleware could be something like this:

As you see, it’s a middleware which inserts information about the last Git revision into the response body. It either calculates the path to the application or takes one as an argument in the constructor, extracts the info in the git_revision_info method, formats it in the message method, updates the response in the call method and passes everything further down the stack. Check out and run the example application to see how it works. (Obviously, in the real world you’d insert stuff before <body> with something like Nokogiri, get the info from Git once when loading the application and not on every request, etc., but for the sake of simplicity, we cut some corners here.)
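A middleware along these lines could look roughly like the sketch below. The method names follow the description above; everything else is an assumption made for this illustration: the default path (the current working directory), the exact `git log` format, and the markup of the inserted message.

```ruby
require 'erb'

module Rack
  class GitInfo
    # `path` is optional; falling back to the working directory is an assumption.
    def initialize(app, path = nil)
      @app  = app
      @path = path || Dir.pwd
    end

    def call(env)
      status, headers, body = @app.call(env)

      parts = []
      body.each { |part| parts << part }
      body.close if body.respond_to?(:close)

      # Append the message and recalculate the body length.
      content = parts.join + message
      headers = headers.merge('Content-Length' => content.bytesize.to_s)

      [status, headers, [content]]
    end

    private

    # Ask Git for the abbreviated hash, author and subject of the last commit.
    # Returns an empty string when Git is missing or the path is not a repository.
    def git_revision_info
      command = ['git', '-C', @path, 'log', '-1', '--pretty=format:%h %an: %s']
      IO.popen(command, err: File::NULL) { |io| io.read }.to_s.strip
    rescue SystemCallError
      ''
    end

    def message
      info = git_revision_info
      info = 'unknown' if info.empty?
      "<p>Last revision: #{ERB::Util.html_escape(info)}</p>"
    end
  end
end

app = proc { |env| [200, {'Content-Type' => 'text/html'}, ["<h1>Hello World!</h1>"]] }
status, headers, body = Rack::GitInfo.new(app).call({})
puts body.first
```

Like the paragraph above says, this shells out to Git on every request, which is fine for a staging environment and a bad idea anywhere else.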

If we want to use a middleware like this in Rails, we would just put something like this in our config/environments/staging.rb or similar (note: for the sake of simplicity, this middleware is not compatible with Rails’ ActionDispatch::Response):

require 'lib/gitinfo'
config.middleware.use Rack::GitInfo

Of course, we could put the logic in a plugin (or „plugem“) as well, having some helper method, and using it somehow like this:

# In app/views/layouts/application.html.erb
<%= print_git_revision_info if staging? %>

That works. But the logic is tied to a Rails application, and we cannot easily use it in, say, a Sinatra application. When we have the logic bundled in a middleware, we can just drop it in any Rack-compatible web application and we are done.

The example above is a very trivial one, on purpose. But it illustrates the important point: taking a specific responsibility, separating it from the rest, and putting it into a dedicated piece of the codebase. Following the same pattern, you can easily take on issues like audit logging, firing asynchronous hooks, responding to heavily loaded „gimme current status of something“ API calls, redirecting to old API versions and many more.

A Real World Example

Another example is one we have to solve at my current contract.

Let’s say you have a web service which encapsulates third-party APIs, and one of them is the Google Maps Data API. This particular API enforces a limit on the number of requests you can make within a certain time. When you receive a 620 response code, you have to back off for a specific amount of time, otherwise you’re cut off.

This means that you have to watch out for such cases, back off the Google web service and pass the information through your web service to the end client. After the specified timeout, you can start hitting Google’s servers again.
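A minimal sketch of such a back-off, under loudly stated assumptions: the class name `UpstreamBackoff` is invented, and so is the `X-Upstream-Backoff` response header, which the wrapped application is assumed to set (with the number of seconds to wait) whenever the third-party API tells it to back off. Until that time has passed, the middleware answers on its own with a 503 and a Retry-After header, so the end client gets the information and the upstream service gets some rest.

```ruby
# Hypothetical middleware; `clock` is injectable only to make testing easy.
class UpstreamBackoff
  def initialize(app, clock = Time)
    @app      = app
    @clock    = clock
    @retry_at = nil
    @lock     = Mutex.new
  end

  def call(env)
    # Still backing off? Answer right away, without touching the application.
    wait = seconds_left
    return throttled(wait) if wait > 0

    status, headers, body = @app.call(env)

    # Assumption: the application signals upstream throttling with this header.
    if (backoff = headers['X-Upstream-Backoff'])
      seconds = backoff.to_i
      @lock.synchronize { @retry_at = @clock.now + seconds }
      body.close if body.respond_to?(:close)
      return throttled(seconds)
    end

    [status, headers, body]
  end

  private

  def seconds_left
    @lock.synchronize { @retry_at ? (@retry_at - @clock.now).ceil : 0 }
  end

  def throttled(seconds)
    [503,
     {'Content-Type' => 'text/plain', 'Retry-After' => seconds.to_s},
     ["Upstream service is throttled, retry in #{seconds} s"]]
  end
end
```

The state lives in the memory of a single process here; with several application processes you would have to keep the retry time in some shared store instead.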

When you think about it, it’s a rather common scenario for a web service to implement a throttling strategy like this. You could code support for the Google Maps Data API specifically, but you would be writing all of this code all over again for the very next API with a throttling strategy. A well-written middleware can support most such cases rather elegantly.

Of course, you could very well put the implementation directly in your application or library code. But that means an if..else or case hell may be just waiting to happen there. It may put quite a lot of mental strain on anyone reading the code and figuring out what is going on. And it may bind the throttling logic very tightly to the main functionality.

A dedicated middleware would make even more sense when you encounter a webservice with an uncommon throttling strategy, such as the Czech companies registry, which allows 1000 requests „during the day“ and 5000 „during the night“. In this case, you can build a specific middleware, based upon some ServiceThrottled::Base.

In this way, we can implement the solution for the „we have to deal with different throttling strategies“ problem in such a way that:

  • the responsibilities of different parts of code are clearly divided,
  • so you can test it separately as well,
  • and the main application/library file stays lean and clear,
  • the code is re-usable in any Rack-compatible application,
  • and inherently modular, i.e. extendable in a sane way.
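To illustrate the „specific middleware based upon some ServiceThrottled::Base“ idea, here is a rough sketch. None of this is an existing library: the two hook methods `window` and `limit`, the `DayNight` class, the 8:00 to 18:00 definition of „day“ and the 503 answer are all assumptions made up for the example. Only the 1000 and 5000 request quotas come from the registry described above.

```ruby
module ServiceThrottled
  # Hypothetical base class: counts requests within a window and refuses
  # those over the limit. Subclasses only describe the strategy.
  class Base
    def initialize(app, clock = Time)
      @app    = app
      @clock  = clock   # injectable, so the strategy can be tested separately
      @window = nil
      @count  = 0
      @lock   = Mutex.new
    end

    def call(env)
      now = @clock.now
      allowed = @lock.synchronize do
        current = window(now)
        if current != @window   # a new window has started, reset the counter
          @window = current
          @count  = 0
        end
        @count += 1
        @count <= limit(now)
      end
      allowed ? @app.call(env) : over_limit
    end

    # Strategy hooks: an identifier of the current window, and its quota.
    def window(now)
      raise NotImplementedError
    end

    def limit(now)
      raise NotImplementedError
    end

    private

    def over_limit
      [503, {'Content-Type' => 'text/plain'}, ["Request quota exhausted, try again later"]]
    end
  end

  # 1000 requests „during the day“, 5000 „during the night“.
  class DayNight < Base
    DAY_HOURS = (8...18)   # assumption: what „day“ means is not specified

    # Simplification: the night quota also resets at midnight.
    def window(now)
      [now.year, now.yday, day?(now) ? :day : :night]
    end

    def limit(now)
      day?(now) ? 1000 : 5000
    end

    private

    def day?(now)
      DAY_HOURS.cover?(now.hour)
    end
  end
end
```

A strategy for a different service is then just another small subclass, and the counting logic can be tested once, with a fake clock, without booting the application at all.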

As more and more low-level stuff is „solved“ in programming, our tasks, as developers, are more and more about finding the solution which properly divides the responsibilities in the codebase, which increases the maintainability of the code, and which decreases the dependency of one part of the code on another („coupling“). In fact, our job is becoming less and less about writing some code which „gets the job done“ and more about creating an abstract representation of the problem, in code. As Frederick Brooks writes in The Tar Pit:

Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures.


If you have a nice example where Rack middleware fits just fine, please do share the experience in comments, blog posts, Twitter, whatever.





NOTE: In case you’d like to point out that adding middleware to the stack affects the application throughput, that’s certainly true. You could of course introduce some hard-to-pin-down bottlenecks in your code this way. I’ve created a simple benchmark suite if you’d like to test things out, but the end result is that even if you mash more than a dozen middlewares into your app, it adds some fifty milliseconds to the average response time.
