
Web server performance shoot out – simple pages

There are some new hot web server frameworks: Ruby on Rails (Ruby), Yaws+ErlyWeb (Erlang) and HAppS (Haskell).

These new frameworks are supposed to facilitate fast development. But, how fast – and scalable – are the applications built in and for these frameworks?

The goal of this post is to get a preliminary answer to this question. NOTE: before you scream “it depends on the problem, and any such comparison must include real code not to be a silly toy experiment”, let me finish this post :-) I know my tests are flawed, but they do give an indication of relative performance and scalability. By scalability, I mean scalability on a single server.

What will be measured is accepting an incoming connection and then either delivering a static page or creating one dynamically, all over HTTP, of course. Important: there are no model objects involved, which means no access to a database or other persistent store.

To counterpoint these new and cool frameworks, we add an old stable contestant to the mix: Apache Web Server. What the heck! We even throw in lighttpd for the fun of it.

If you enjoy minutiae about the exact versions and environment used for this analysis, go to the end of the post.

Let the fun begin!

I use ab (ApacheBench) to measure the time for each request-response cycle, including the loopback socket communication.

The command executed is
ab -n 1000 -c 100 http://localhost: /list/
which keeps 100 requests in flight at a time until 1000 requests have completed. What I measure is the number of requests that can be handled in one second. Since this uses the loopback address and very simple pages, the result can be viewed as an upper limit for actual use.
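For readers who want to see what such a measurement boils down to, here is a minimal Python sketch of the same idea: keep a fixed number of requests in flight until a total count is reached, then divide the count by the elapsed time. This is not how ab itself is implemented; `fetch_url` and `measure_throughput` are names made up for this illustration.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_url(url):
    """Fetch one page and return the HTTP status code."""
    with urllib.request.urlopen(url) as response:
        response.read()  # read and discard the body, as a benchmark client would
        return response.status


def measure_throughput(fetch, total=1000, concurrency=100):
    """Call fetch() `total` times with at most `concurrency` calls in flight.

    Returns (completed_calls, requests_per_second).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() forces every call to finish before the clock is stopped
        results = list(pool.map(lambda _: fetch(), range(total)))
    elapsed = time.perf_counter() - start
    return len(results), len(results) / elapsed
```

It would be called as `measure_throughput(lambda: fetch_url(url))`, where `url` is the page under test. Threads in a Python client add overhead of their own, so the absolute numbers from a sketch like this would not be comparable to ab's.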

NOW your favorite tool’s performance will be judged! Can you take the truth? Ok, here it comes:
[Figure: WebPerf1, the benchmark results chart]

UPDATE 12/10/07: for some paraphenomenal reason, Yariv ran almost identical tests the very same day, although he focused only on ErlyWeb vs Rails. Anyway, his results are quite similar to mine: look for yourself.

Remarks

  • Well, for static content, lighttpd is the clear winner, and for dynamic content, it is almost a draw between two of the four – IMHO – coolest functional languages right now (the other two being F# and Scala), with a slight advantage to the Haskell framework. This is pretty interesting since HAppS is in very early development; ok, admittedly, so is ErlyWeb, but Erlang is not ;-)
  • The currently most popular framework amongst the kids, Ruby on Rails, is sloooow, handling just some 5% of the requests of the functional language frameworks; and this is on one server! I do not want to imagine the difference once a few servers are added :-)
  • What about Java (J2EE, Spring and/or Tomcat JSPs) and ASP.NET!!?? Well, I will include them later…
  • Both RoR and ErlyWeb used a rather intricate dispatching mechanism; well, you can clearly see who is the performer in that battle!
  • ErlyWeb had a hard time handling 1000 requests coming within an interval of a few seconds, so I had to lower the burst to 500 requests there. The problem was not as severe when one bypassed the dispatch and used raw Yaws.

Details

    The test box is a MacBook Pro with a dual-core CPU running at some 2.x GHz and ample RAM – enough to avoid any disk swapping.
    UPDATE 12/10/07: I am using the same box for the (light) client. I do not think that affects the relative measures, but it could affect the absolute ones slightly. I did that so as not to care about the transfer of bytes over the wire. Ok, I admit it, I also did it not to have to set up an Amazon EC2 box for this sole purpose ;-)

    The frameworks are:

    • LAMP – Apache 2.2.6 + PHP 5.2.5
    • RoR – Ruby 1.8.6 + Rails 1.2.6 + Mongrel 1.1.1
    • ErlyWeb – Erlang 5.5.5 + Yaws 1.68 + ErlyWeb 0.6
    • HAppS – GHC 6.8.1 + HAppS 0.9.1a
    • lighttpd – lighttpd 1.4.18

kworthington said,

December 10, 2007 @ 4:42 pm

Why didn’t you do a test of dynamic pages/second on lighttpd?

Also, running ab (Apache bench) from the server you want to test is
really inaccurate since the processor has to work harder to serve the
pages and run the test.

Can you re-test with ab from another system? (and possibly include
dynamic pages on lighttpd?)

Regards,
Kevin

tom said,

December 10, 2007 @ 4:52 pm

As was also commented on Yariv’s benchmarks post, your rails configuration is probably not right.

A) You need to make sure you are running in production mode. (mongrel_rails -d -e production)

B) You need to run multiple mongrels load balanced behind something like apache, nginx, lighttpd, whatever. Each mongrel handles requests serially; Rails itself does no threading. This makes a massive difference.

Between these two changes, Yariv updated his Rails figures from 112.8 R/sec, to 699.8 R/sec. Pretty significant. This approach is also why Rails is kind of a memory hog.

I develop in RoR at my day job (for the last 1.5 years) and build interesting little web apps using Erlang + MochiWeb (but not Yaws or ErlyWeb) in my spare time (been using Erlang actively for about 6 months, only recently using it for web stuff).

There are many things I love about Erlang, performance among them, but it is also not even remotely comparable in expressiveness and ease of development to Rails. I’ve only briefly looked at ErlyWeb and it’s not even in the same category. I don’t think there’s any way to even make a language like Erlang look quite so friendly for casual web development. It’s just simply not as malleable a language as Ruby.

As far as scalability goes, RAM is cheap. The shared-nothing approach of Rails, with its many separate processes, is actually pretty similar to the Erlang approach. It just consumes a lot more resources. If you are at a company like the one I’m at now, you really do just buy more server slices whenever you feel like you need to serve more requests. Performance and scalability are different things.

Had to give a little defense of Rails, which has many things to recommend it (and gets misunderstood a lot, payback for all the hype I suppose), even though my language of choice these days is Erlang.

davber said,

December 10, 2007 @ 5:46 pm

Tom:

(A) I did use the production environment, and (B) this test was supposed to measure performance on one box; I agree that an even more interesting test would be to have a cluster of nodes.
Defending Yariv a bit, ErlyWeb is definitely more amenable to “casual web development” than the other functional web framework I tested: HAppS. Actually, I think I could get a not-too-senior(=”old”) team to build something with ErlyWeb :-)

Re RAM, RAM is not cheap above somewhere between 2 and 4 GB; going beyond that often involves changing platform…

I agree that RoR rocks when it comes to fluidity of development – it just feels easy and fun – but the very late dynamic binding makes almost all problems pop up late, instead of during compilation. And, as this performance test indicates, it is simply in a different league performance-wise than ErlyWeb.

davber said,

December 10, 2007 @ 5:53 pm

Kevin:

Why not separate boxes? Well, of course one should do that to have more exact figures, BUT:

1. The impact of – via ApacheBench – connecting, receiving and discarding some 100 bytes per request should not be THAT big, i.e., one should be able to get reasonable absolute metrics.

2. More importantly, the impact should be even across the platforms, i.e., it is the same light client, so relative performances should be extractable. In fact, the client overhead should be linear with a very small constant, so the ratios should remain the same, as long as one is below the threshold of open sockets/files/whatever, which the 100 simultaneous requests should be. I am open to further investigation into this, though.

3. I have a dual core box, and I explicitly allocated the client on “the other core”; ok, I am lying, I do not know how good OS X is here.

A remark there: I did nothing to the frameworks to use multiple cores.
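One way to sanity-check the argument about client overhead is a toy model. The sketch below assumes the worst case, in which the client adds a fixed cost to every request and that cost is serialized with the server's work; all the numbers are hypothetical and chosen only for illustration. Under that assumption the ranking of the servers is preserved, while the measured ratio shrinks somewhat toward 1, so the gaps would be understated rather than inflated.

```python
def measured_rate(server_seconds, client_seconds):
    """Requests per second when client and server work is serialized per request."""
    return 1.0 / (server_seconds + client_seconds)


def rate_ratio(fast_server, slow_server, client_seconds):
    """How many times faster the fast server looks than the slow one."""
    return (measured_rate(fast_server, client_seconds)
            / measured_rate(slow_server, client_seconds))


# Hypothetical per-request costs in seconds, not measurements from this post.
FAST, SLOW = 0.001, 0.020

true_ratio = rate_ratio(FAST, SLOW, client_seconds=0.0)     # no client overhead
seen_ratio = rate_ratio(FAST, SLOW, client_seconds=0.0002)  # small fixed client cost
```

Here `seen_ratio` comes out smaller than `true_ratio` but still well above 1. The smaller the client cost is relative to the server cost, the closer the two ratios get.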

tom said,

December 10, 2007 @ 7:07 pm

You still need to run multiple instances of mongrel regardless of having only one box. Rails is designed that way; I know it sounds a little weird. Each instance handles exactly one request at a time, and with all the asynchronous stuff like I/O that’s involved, that means that you *have* to have parallelism from either threads or processes to handle a lot of requests. Rails doesn’t do threads; you get parallelism by running many processes. On each of our boxes, we run as many mongrels as RAM will allow. Tuning a machine running Rails actually involves picking a good number of mongrels to have running behind the real web server.

It is absolutely wrong to compare a single mongrel to a framework that is doing its own concurrency via threads; the numbers don’t mean anything. Note that in Yariv’s benchmark, the main performance difference was not from running in production mode. The difference was from running many mongrels on one box. This is not optional.
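To make the point about serial workers concrete, here is a deliberately idealized Python model. It assumes every request takes the same time, requests are spread evenly across workers, and workers never compete for CPU or RAM; real mongrels will do worse than this. The function names and the 10 ms service time are hypothetical.

```python
import math


def batch_time(requests, workers, service_seconds):
    """Time to finish `requests` when each worker handles one request at a time.

    Idealized: even distribution, no contention between workers.
    """
    rounds = math.ceil(requests / workers)
    return rounds * service_seconds


def throughput(requests, workers, service_seconds):
    """Requests per second for the whole batch."""
    return requests / batch_time(requests, workers, service_seconds)
```

With a hypothetical 10 ms per request, `throughput(1000, 1, 0.010)` is about 100 requests per second and `throughput(1000, 4, 0.010)` is about 400. A single serial process caps throughput no matter how many requests arrive at once, which is why a one-mongrel figure says little about what the box can do.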

Curt Sampson <cjs@cynic.net> said,

December 10, 2007 @ 7:25 pm

I’ve found that ab (ApacheBench), because it waits for a response to every request, has great difficulty pushing a web server to the point where it will fall over, which may come surprisingly early when the simultaneous requests keep coming in. I suggest having a look at httperf.

tom said,

December 10, 2007 @ 7:34 pm

Oh, and one more thing about the price of RAM. The great thing about horizontal scalability is that you can just add more boxes. Scaling up is expensive, scaling out is cheap. Don’t buy more expensive RAM to add to a monolithic server, add more cheap application servers.

That is the point of the process-oriented approach. Since Rails processes have no expectation of being able to communicate with each other except via their connection to the db, it doesn’t matter what machine you run those processes on.

I think ErlyWeb being more amenable to casual web development than HAppS says more about HAppS than about RoR. I really respect Yariv’s work. Though I don’t actively use ErlyWeb, I have found his blog and code immensely helpful, but I do not find ErlyWeb even remotely comparable to what Rails makes available. It is awesome that it makes things easier, but it’s not the same, sorry. Erlang has other strengths.

Also, if you dislike dynamic/late binding, why are you giving Erlang a pass, seeing as it is in the same position? I personally like it; if you are getting good code coverage out of your tests (and you are writing tests, aren’t you?), you shouldn’t run into problems.

I should just be making my own blog post, as long as these comments are :)

tom said,

December 10, 2007 @ 8:30 pm

Can’t resist one last comment on this. A well set up benchmarking test involving Rails vs Merb (a closely related Ruby framework): http://www.webficient.com/2007/08/testing-various-configurations-of-rails.html

The discrepancy between your R/sec numbers and theirs is quite large. Also note the big changes in numbers depending on whether sessions are being used and what the session store is. Do you have sessions turned off?

My final beef with this entire analysis is that using mongrel to serve static pages is bizarre. It is great when you are in development because you don’t have to set anything else up, but in any real-world situation, only your dynamic requests will go through mongrel. No attempt has been made to optimize it as a static web server.

I take the same approach when building Erlang web apps. Nginx is vastly faster than Yaws for static content, so I’m not really interested in using Yaws as a general-purpose web server. Just serve static content from Nginx and proxy dynamic content to the Erlang app, problem solved. This is also how our Rails apps are configured.

davber said,

December 10, 2007 @ 9:29 pm

Kevin:

I skipped dynamic pages with lighttpd since I only threw it in there to compare the delivery of static content from these web frameworks to what I assumed – and rightly so – to be the fastest such server.

I could – and will for my next version, where I *will* include .NET and Tomcat and/or JBoss and/or something Springish – include FastCGI to PHP from lighttpd, since I know that is a reasonably common web server environment.

One interesting note, when we talk about static content, is that both ErlyWeb and HAppS are considerably faster at generating dynamic content than at serving static content. If that comes as a surprise, it should not, since serving static content means reading a text file from disk.

davber said,

December 10, 2007 @ 9:54 pm

Curt:

I will try httperf out. Thanks for the pointer!

Tom:

Never say “the last comment.” You know there will be more ;-) Regarding using Mongrel for static content – yep, that is a bit crazy, and I would *not* do it myself – *but* I wanted to see how these frameworks served static pages as well. I welcome any suggestion of a complete and realistic combination. Thanks for the Nginx pointer!

But the fact still holds that most Rails apps *do* use Mongrel for dynamic stuff, right? If so, that bar is still valid. And, honestly, most Rails apps serve static stuff (images…) only at the beginning of sessions, right? So that bar is not that important anyway.

Regarding the late binding. Well, the fact is that – in Erlang – the binding is not that late at all. It is all bound, with a static check, at compile time. It is just that the “static types” available during compilation are more like “kinds” in Haskell, i.e., they represent meta types when viewed from a meta interpretation point of view. What I meant with those convoluted words was that “no, you are not right, Erlang *does* indeed check for existence of functions, arities and stuff” before runtime, whereas Ruby does not.

davber said,

December 10, 2007 @ 10:12 pm

Disclaimer: I do like Ruby (on or off Rails) and Erlang :-) It is just that I think there is something to gain in telling others – and the computer – what your intended use is for various variables and constructs, which is more doable in a statically typed language. In my extremely humble opinion, the only time dynamic typing – which is *distinct* (although not completely orthogonal…) from late binding of constructs – is qualitatively beneficial is when hard core run-time polymorphism is used; note that this latter methodology can also be referred to as “not knowing what one does and/or not having really decided on what these thingies passed around are yet” ;-)

Ok, I admit to enjoying the lenient nature of dynamic(ally typed) languages at times, so take my criticism above with a grain of salt.

davber said,

December 10, 2007 @ 10:16 pm

Ok, this is my last Static/Dynamic flame war post – in this thread:

I really dislike the classification “dynamic” for dynamically typed languages. The name suggests they are, well, more dynamic, which is simply not true in general. For instance, the generic programming notions of templates in C++ and type classes in Haskell make for quite dynamic programming and solutions. If you add the meta facilities induced by those C++ templates (and advanced use thereof) or the true compilation-phase execution model of Template Haskell, you get highly dynamic languages.

tom said,

December 11, 2007 @ 12:04 am

Erlang actually does late binding. There are really two types of function references, local and external. External funs, outside of the current module, are late bound. The distinction between the two becomes very important when dealing with code swapping. I’m trying to find an online reference that explains this better than I can, but am coming up short. It’s explained well in Joe Armstrong’s book.

Also, since the arguments are untyped, just checking arity and existence is pretty weak compile-time checking. There is, however, a tool called Dialyzer that does static analysis, getting much of the error detection benefit of a language that actually is statically typed.

Rails deployments do typically use mongrel these days, but you do have to be running a bunch of them as separate processes for the reasons previously discussed. I just think it’s important from looking at the relative advantages of the systems to realize that even though you have all these heavy mongrels running around doing serial processing of the dynamic requests, they aren’t going to be interfering with the serving of your static content.

Also, this is a big deal when thinking about caching. By having parameters in the path, as opposed to the query string, you can cache the page to disk and hand it off to the static web server. In which case, it doesn’t even go near the mongrel until you invalidate the cache. It is important to realize that for many common caching situations, mongrel will be out of the loop entirely.
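The caching idea above can be sketched in a few lines of Python. This is an illustration only, not Rails' implementation: `cache_file_for` and `serve` are hypothetical names, the `.html` layout under a public directory is an assumption, and in a real deployment the static hit would be served by the front web server rather than by application code.

```python
from pathlib import Path
from urllib.parse import urlsplit


def cache_file_for(url, public_dir):
    """Map a request URL to the file a static server would look for.

    Only the path is used, which is why parameters must live in the path:
    two URLs differing only in their query string map to the same file.
    """
    path = urlsplit(url).path.strip("/") or "index"
    return Path(public_dir) / (path + ".html")


def serve(url, public_dir, render_dynamic):
    """Return the cached page if present; otherwise render it and cache it.

    Sketch only: no path sanitizing and no cache invalidation.
    """
    cached = cache_file_for(url, public_dir)
    if cached.is_file():
        return cached.read_text()  # static hit: the app server is never called
    body = render_dynamic(url)
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_text(body)
    return body
```

After the first request for a given path, later requests for it never reach `render_dynamic`, which plays the role of mongrel here. Deleting the file is what invalidating the cache amounts to in this sketch.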

I actually agree on the static/dynamic naming issue confusing the hell out of people. There is also a strong tendency to confuse dynamic and weak typing. Ruby, for instance, is strongly typed in a way that C/C++/Java are not, since there is no such thing as casting. I don’t really have a preference. My last professional incarnation was doing console game development in C++ which obviously places a radically different priority on questions of brute efficiency. I will say that I find myself very productive in Ruby and Erlang but do not attribute that much to the type system.

Erlang-China » about "Ror vs ErlyWeb perfomance" said,

December 11, 2007 @ 12:25 am

[...] First of all, ErlyWeb still has plenty of room for optimization. For example, as mentioned [here]: ErlyWeb had a hard time handling 1000 requests coming within an interval of a few seconds, so I had to lower the burst to 500 requests there. It did not have as a severe problem if one bypassed the dispatch and used raw Yaws. [...]

davber said,

December 11, 2007 @ 1:45 am

Tom:

“I try to get out and they keep dragging me back in!” ;-)

I was primarily thinking about intramodular references when I wrote about not-as-late-binding in Erlang (cf Ruby), so at least one gets some help at compile-time ;-) Yes, you are of course right, and that should be stated explicitly for people seeking the Holy Grail of problem solving: binding is deferred between modules in Erlang!

And, if one wants really late binding, one can use reflection, where even the signature is decided after process startup…

Regarding multiple Mongrel instances on one box, I promise you to use that as one of the framework environments in the next revision of this test.

Ok, one discussion I will not have (here and now) is that of implicit type conversion, user-defined or not, and arithmetic propagation constituting a weaker type system ;-)
