Scalable and performant Ajax using Grails and Jetty

Ajax is a popular technique for creating highly interactive websites and web applications. It is based on the browser making asynchronous requests over HTTP, and often involves a client regularly polling a server to see whether it has any updates. This is a well-known technique, and I won’t attempt to cover exactly how it is done, as there are plenty of JavaScript libraries out there to help.

That polling scenario is the one I am dealing with in this article. The question is: how do you get great performance out of the technique? Regular browser polling, when implemented naively, can easily become a troublesome scalability problem, and one that will only show up in serious load tests / network failure tests (you do run these, right?). How do we prevent this?

Imagine the scenario: you’ve built your app, the client JavaScript is set to poll every 5 seconds, and the server responds in about 100 ms per request. Not bad; all seems fine. You deploy, go live, and users start arriving.

You may have noticed that the above scenario introduces up to 5 seconds of latency between the server learning something and the client being informed on its next poll. Not the kind of performance we expect in a new application, surely! You could shorten the polling interval to improve latency; this may or may not work, but you had better hope your server keeps up with servicing the requests. If not, you’ll soon DDoS your own server (and database, and router, etc.) as connection requests start stacking up.

Now, a quick bit of maths. Each client polls every 5 seconds, and each request takes 100 ms to process, so (averaging out) a single thread can deal with 50 clients. Hmmm, that’s not a lot. Oh well, ramp the thread pool up to, say, 50 (a common default); this gives you a maximum of 2500 clients on your server before connection times start to suffer solely due to thread contention (ignoring any other capacity issues).
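The arithmetic, as a throwaway Groovy sketch (numbers straight from the scenario above):

int pollIntervalMs = 5000   // client polls every 5 seconds
int serviceTimeMs  = 100    // server-side work per request
int threads        = 50     // a common thread-pool default

int clientsPerThread = pollIntervalMs.intdiv(serviceTimeMs)   // 50
int maxClients       = clientsPerThread * threads             // 2500
println "${clientsPerThread} clients per thread, ${maxClients} clients per server"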

Now, in our polling example (let’s make it an email application), our clients will be checking for mail every 5 seconds (our Ajax poll), and will more than likely be told that nothing has changed.

So you have 2500 clients per server, most of which will receive real updates on the order of minutes, and the effect is to hammer your server (and database, and router, etc.) with essentially spurious requests. Check your load graphs: they won’t be pretty for an app that should be doing very little.

Come back web 1.0, all is forgiven!

I am going to describe and show two techniques that, used together, can reduce this load to where it should be, so that it tracks the real activity in the application rather than the polling schedule, removing the large overhead. I’ve seen reductions of two orders of magnitude.

HTTP is a request/response protocol; there is no way around this, and anyone who purports to have found one is stretching the truth somewhat. There is, however, a technique that can be used to simulate the server making a connection to the browser and pushing information when it knows a change has been made. This neatly solves the latency problem above.

The technique tends to be called long polling, or Comet, and is relatively well known (see Cometd, Bayeux and others). Imagine an HTTP connection being established: the client sends its request, but the server doesn’t respond, and also doesn’t close the connection. If the timeout is set to a large value (say one minute), the server can validly wait up to a minute before responding. At any point within that minute, the server may learn something useful to this particular client, drop the information into the response, and send it. The server is given control of when to respond.

Great! Servers can choose what information flows to clients, and when. We can have sub-second Ajax performance! Imagine the possibilities…

There is a problem, however. Try implementing this in a Java Servlet container and you will run out of ideas quite quickly and call Thread.sleep(5900) (or something similar) in a loop. This blocks the thread, very quickly draining your thread pool. Oh dear: our 2500 concurrent clients with up to 5 seconds of latency have suddenly become 50 with almost no latency. This may be fine for your application. Let’s assume not.
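For concreteness, the blocking version of a long-poll action looks something like the sketch below, where hasMessagesFor() and messagesFor() are hypothetical stand-ins for your own message store:

// Naive long poll: the container thread is pinned for the whole wait.
def naivePoll = {
  long deadline = System.currentTimeMillis() + 60000
  // hasMessagesFor()/messagesFor() are hypothetical helpers
  while (!hasMessagesFor(params.clientId) && System.currentTimeMillis() < deadline) {
    Thread.sleep(100)   // the thread sleeps but stays allocated to this request
  }
  render messagesFor(params.clientId)
}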

The second technique is less of a technique and more of an infrastructure choice. Jetty (version 6) is a relatively well-known Java Servlet container, mostly encountered embedded in one of the larger JEE application servers. Standalone, however, it is perfectly capable, and it has a trick up its sleeve when it comes to implementing long polling.

This is its Continuation mechanism, which allows the thread servicing a request to be detached from it, and the request to be parked (suspended) until the app decides it wants to do something with it. Consider this for a second. Our app had a theoretical maximum of 2500 clients, limited by thread contention alone. All 2500 clients could connect, their requests would be put to sleep, and the threads returned to the pool. A client is only interacted with when something changes that it needs to know about. If your app is relatively low on updates, you could support tens of thousands of clients, with sub-second latency, from a single server (it depends on your app at this point: test, test, test!).

Grails (1.3.0) is my environment of choice, and it dovetails neatly with Jetty for development.

Note that the code below will work on Tomcat, but will not follow the same scaling curve as Jetty, because Tomcat lacks a Continuation mechanism: when a continuation blocks, it holds onto its thread. It appears possible to implement a similar system on Tomcat using its advanced IO / Comet support, but that would be rather more work to integrate into Grails.

A version of the Continuation mechanism is in the next version of the Servlet spec (Servlet 3.0), so one can expect it to be implemented across all the servlet containers. The new specification is broadly a superset of the Continuation implementation in Jetty 6.

Now, to code!

Setup

I’m not going to go into detail on many of the steps, due to lack of space; however, the Grails docs are extensive and well written.

First, create a new app, remove the Tomcat plugin, and install the Jetty plugin.
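With the Grails 1.3 command line that is along these lines (the app name is yours to choose; plugin names as published in the Grails plugin repository):

grails create-app yourapp
cd yourapp
grails uninstall-plugin tomcat
grails install-plugin jetty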

Add jetty-util to your dependencies (in BuildConfig.groovy):

dependencies {
  runtime 'org.mortbay.jetty:jetty-util:6.1.22'
}

When you call continuation.suspend(), a special exception (RetryRequest) is thrown, which the container interprets as an instruction to suspend the current request. By default, Grails catches all exceptions bubbling up from app code. We prevent this by creating our own Spring exception resolver and wiring it in resources.groovy; it selectively rethrows RetryRequest exceptions to inform Jetty that we want to suspend the request.

package osj;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.codehaus.groovy.grails.web.errors.GrailsExceptionResolver;
import org.codehaus.groovy.grails.web.servlet.mvc.exceptions.ControllerExecutionException;
import org.codehaus.groovy.runtime.InvokerInvocationException;
import org.mortbay.jetty.RetryRequest;
import org.springframework.web.servlet.ModelAndView;

public class RetryRequestExceptionResolver extends GrailsExceptionResolver {
  public ModelAndView resolveException(HttpServletRequest request,
      HttpServletResponse response, Object handler, Exception e) {
    // Grails wraps controller exceptions; unwrap to find a RetryRequest
    if (e instanceof InvokerInvocationException
        || e instanceof ControllerExecutionException) {
      Throwable root = getRootCause(e);
      if (root instanceof RetryRequest) {
        // Rethrow so it reaches Jetty, which then suspends the request
        throw (RetryRequest) root;
      }
    }
    return super.resolveException(request, response, handler, e);
  }
}

And in resources.groovy:

beans = {
  // Replace the standard Grails exception handling so that a RetryRequest
  // can pass up to the top level, where Jetty will act on it.
  exceptionHandler(osj.RetryRequestExceptionResolver)
}

Now, we want a simple example of a request being put to sleep and woken up when something interesting happens. Don’t expect production code!

import org.mortbay.util.ajax.Continuation
import org.mortbay.util.ajax.ContinuationSupport

class MessageController {

  // For demonstration only: multiples of these should be kept in a
  // Grails service designed to look after them
  static Continuation continuation

  def index = {
    Continuation cont = ContinuationSupport.getContinuation(request, this)
    if (!cont.isResumed()) {
      continuation = cont
      // Throws RetryRequest; Jetty parks the request for up to 60 seconds
      cont.suspend(60000)
    }
    // Reached on the retried request, after resume() or timeout
    render "Resumed at ${new Date()}"
  }

  def resume = {
    continuation?.resume()
    log.info "Resumed continuation"
    render "Attempted to resume other request at ${new Date()}"
  }
}

To test, go to http://localhost:8080/<yourapp>/message. Your browser will block for up to a minute. Open a second window and visit http://localhost:8080/<yourapp>/message/resume. Your first browser will receive a response with a date/time corresponding to when you hit the resume URL.
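The same test from the command line, substituting your app name (the first curl hangs until you run the second, or the minute expires):

# terminal 1: blocks for up to a minute
curl http://localhost:8080/yourapp/message

# terminal 2: wakes the parked request; terminal 1 then gets its response
curl http://localhost:8080/yourapp/message/resume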

Next Steps

This only demonstrates the very basics of the mechanism; there’s a chunk more you would need to do. Most obviously, wrap the management of connections in a Grails service, with logic to decide what each client needs to know as soon as it becomes available, and to dispatch it straight away (instead of just dropping it in the DB and waiting for the next poll). A minimal sketch of such a service follows the list below.

A few things you should think about when implementing the above:

  • Buffering of messages (clients aren’t guaranteed to be connected)
  • Implementing a message protocol on the server and client side
  • Managing and expiring Continuation instances
  • OS network connection / file handle limits (these will need raising to reach the maximum number of clients you want a single server to support)
  • If you want normal Ajax polls going on at the same time as the long poll, remember that browsers limit the number of concurrent HTTP connections to a single domain/port. The spec says 2 (IE follows this); Firefox defaults to 6. You are only really safe with 2 concurrent connections per browser instance.
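As a starting point for the first three items, something along the lines of the service below could buffer messages per client and wake the parked request when one arrives. This is a minimal sketch: the class and method names (MessagePushService, poll, push) are my own, and production code would still need expiry of stale continuations and error handling.

import org.mortbay.util.ajax.Continuation

class MessagePushService {

  static transactional = false

  // clientId -> parked continuation (a client isn't guaranteed to be connected)
  private final Map<String, Continuation> waiting = [:]

  // clientId -> messages buffered while no request was parked
  private final Map<String, List<String>> buffers = [:].withDefault { [] }

  // Called from the long-poll action. Returns buffered messages if there are
  // any; otherwise parks the request (suspend() throws RetryRequest, so this
  // method does not return on the first pass through).
  synchronized List<String> poll(String clientId, Continuation cont, long timeoutMs) {
    List<String> pending = buffers[clientId]
    if (pending) {
      List<String> result = new ArrayList<String>(pending)
      pending.clear()
      return result
    }
    waiting[clientId] = cont
    cont.suspend(timeoutMs)
    return []   // reached only when a retried request finds nothing queued
  }

  // Called by whatever produces events: buffer the message, then wake the
  // client's parked request if one is waiting.
  synchronized void push(String clientId, String message) {
    buffers[clientId] << message
    waiting.remove(clientId)?.resume()
  }
}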

Ajax is a popular buzzword, and like any popular buzzword it should be investigated critically. It is not a panacea.

Here, however, we’ve seen the beginnings of a useful messaging infrastructure between a server and a browser. Interesting!

This could be integrated into your current infrastructure and extend your routing system all the way to the browser.

It wouldn’t be too hard to envisage creating a Spring Integration Channel or Endpoint implementation, for example, based around the Continuation instances.