November 25th, 2009
Why write this post in the first place?
To show off your cool stuff?
To deride the other perfectly good software?
No, the only reason I am writing this post is to back up our statement that WebROaR is generally much faster than all other comparable deployment stacks. The aim is to provide real-world data and insight into our benchmarking procedure for everyone. Honestly, there is not much fun in performing this long experiment, as we programmers do have very short attention spans. (Oh wait, let me check HN/programming reddit once again ...)
The results of this comparison should just be taken as an indicator, and should act only as one of many data points in forming your own conclusions about Rails deployment stacks. If you are seriously interested in knowing the performance numbers of your application, please do try this experiment at home. :-)
I can't put across the basics more 'eloquently' than this Zed Shaw essay, and would recommend giving it a read. There are no 'eye-opening', 'lift the darkness' kind of insights in it, but at times we all make obvious blind mistakes while benchmarking that can easily be avoided. I hope we haven't made many of those in our experiment.
We tried taking care of the following aspects for this experiment:
Ensure the rest of the environment is exactly the same (as far as possible) for all deployment stacks being tested, i.e. same hardware, same software versions, same test application and same tester as well. :-)
Use the right benchmarking tool that provides relevant and useful statistics.
Do not make it a "Hello world" application comparison, which would be of little use in the real world.
Allow each server stack to initialize/warm-up as required before running a performance test over it.
We selected httperf for conducting this experiment for its ability to provide the relevant statistics. The data presented below should vouch for its usefulness.
I would also recommend this PeepCode screencast that provides a very good introduction to benchmarking with httperf. (Disclaimer - It's not free and we don't get any affiliate money referring it :-) )
CPU - Intel Core 2 Duo 2.8 GHz
RAM - 1GB
OS - Ubuntu 8.10 (Intrepid) Desktop
Kernel - Linux 2.6.27-7-generic
Ruby - 1.8.7 (2008-08-11 patchlevel 72) (Ubuntu Package)
CPU - Intel P-IV 2.66GHz
Memory - 1 GB
OS - Debian 5.0 (lenny)
Kernel - Linux 2.6.26-1-686
httperf 0.9.0 compiled without DEBUG and without TIME_SYSCALLS
100 Mbps LAN (1 hop between the two machines)
Rails Version: 2.3.4
Database: SQLite 3
We have tried to use a URL that involves database interaction, so as to simulate a typical web application scenario. This ensures most parts of the Rails stack are exercised by the test. Please note that caching is not enabled at all for this URL.
We chose the following servers for this comparison.
Apache 2.2.9 (mpm-prefork) (Ubuntu Default) + mod-proxy-balancer + Mongrel 1.1.5 Cluster (6)
Apache 2.2.9 (mpm-prefork) (Ubuntu Default) + Passenger 2.2.7 (MaxPoolSize = 6)
Apache 2.2.9 (mpm-prefork) (Ubuntu Default) + mod-proxy-balancer + Thin 1.2.5 Cluster (6)
WebROaR v0.2.3 (Set 6 maximum workers for the deployed application)
Since the focus of our test is to compare the maximum requests/sec output by each stack, and the specific request is served by Rails from the database, replacing Apache with Nginx should not give dramatically different results. (As mentioned earlier, these tests are just indicative of the performance level you can generally expect from your server.)
We would be happy if someone can repeat the test with Nginx as well.
We would like to find out the maximum requests/sec output by each server when the selected application URL is bombarded with requests from httperf.
Consider this command:
httperf --hog --server ABC --num-conns 2000 --num-calls 10 --uri /wiki/list --rate 11
It tries to make 2000 connections to the server ABC and sends 10 requests through each, for the URL /wiki/list. The total number of requests made is 2000*10 = 20,000. The 'demanded request rate' is the product of the rate and num-calls parameters, i.e. 11 * 10 = 110 requests/sec.
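The arithmetic above can be sketched in a few lines of shell (the parameter values are the ones from the example command; nothing here talks to a server):

```shell
# httperf parameter arithmetic for the example command above.
NUM_CONNS=2000   # total connections httperf will open
NUM_CALLS=10     # requests sent through each connection
RATE=11          # new connections opened per second

TOTAL_REQUESTS=$((NUM_CONNS * NUM_CALLS))   # 2000 * 10 = 20,000 requests
DEMANDED_RATE=$((RATE * NUM_CALLS))         # 11 * 10 = 110 requests/sec
echo "total=$TOTAL_REQUESTS demanded_rate=$DEMANDED_RATE"
```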
The output would look similar to the following if the server is able to handle this load:
httperf --hog --client=0/1 --server=ABC --port=80 --uri=/wiki/list --rate=11 --send-buffer=4096 --recv-buffer=16384 --num-conns=2000 --num-calls=10
Maximum connect burst length: 1

Total: connections 2000 requests 20000 replies 20000 test-duration 181.970 s

Connection rate: 11.0 conn/s (91.0 ms/conn, <=13 concurrent connections)
Connection time [ms]: min 107.4 avg 321.2 max 1724.8 median 238.5 stddev 228.1
Connection time [ms]: connect 0.2
Connection length [replies/conn]: 10.000

Request rate: 109.9 req/s (9.1 ms/req)
Request size [B]: 75.0

Reply rate [replies/s]: min 97.2 avg 109.9 max 117.4 stddev 4.0 (36 samples)
Reply time [ms]: response 31.8 transfer 0.3
Reply size [B]: header 465.0 content 4930.0 footer 0.0 (total 5395.0)
Reply status: 1xx=0 2xx=20000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 29.39 system 151.45 (user 16.2% system 83.2% total 99.4%)
Net I/O: 587.1 KB/s (4.8*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
The important fields to look at in the above result are:
Reply rate [replies/s]: min 97.2 avg 109.9 max 117.4 stddev 4.0 (36 samples)
This tells us that the average reply rate was 109.9 RPS with a standard deviation of 4.0, measured over 36 samples (httperf samples every 5 seconds; this specific test ran for 181.970 seconds). For the statistically inclined, avg RPS ± 2 * std deviation should cover about 95% of the values.
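That ± band can be pulled straight out of httperf's reply-rate line; a minimal sketch, using the sample line from the output above:

```shell
# Extract avg and stddev from httperf's reply-rate line and compute
# the avg ± 2*stddev band that covers ~95% of the samples.
REPLY_LINE="Reply rate [replies/s]: min 97.2 avg 109.9 max 117.4 stddev 4.0 (36 samples)"
AVG=$(echo "$REPLY_LINE" | awk '{print $7}')    # 109.9
SD=$(echo "$REPLY_LINE"  | awk '{print $11}')   # 4.0
echo "$AVG $SD" | awk '{printf "~95%% of samples in [%.1f, %.1f] RPS\n", $1 - 2*$2, $1 + 2*$2}'
```

For this test that works out to roughly 101.9 to 117.9 RPS.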
There were no errors reported.
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0 Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
From this test we can conclude that the particular server under test can easily handle the demanded rate of 110 RPS without any errors, and can be subjected to more load.
Our aim is to find the maximum average reply rate output by each server when subjected to a series of high demanded request rates. After each test at a specific demanded request rate, the server is restarted and warmed up before running the next, higher-rate test.
As we keep increasing the demanded rate, the server gets saturated; its reply rate stops increasing and, after a point, slowly starts to degrade. It could also start throwing errors.
We would like to find the saturation point for each server and the maximum avg reply rate output at that point.
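The sweep itself is mechanical; here is a dry-run sketch that prints the httperf command for each step (the rate steps are hypothetical examples, the server name ABC and URL are from the example above; remember to restart and warm up the server between actual runs):

```shell
# Dry run of the rate sweep: print one httperf command per demanded rate.
# Pipe the output to sh (with restart/warm-up steps in between) to execute.
for RATE in 11 13 15 17 19 21; do
    echo "httperf --hog --server ABC --num-conns 2000 --num-calls 10 --uri /wiki/list --rate $RATE"
done
```

Watching the "Reply rate" and "Errors: total" lines of each run tells you when the server has saturated.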
All possible additional applications/daemons should be stopped.
Rails environment should be set to production.
At any given point of time only one server deployment should be active on the machine, and it should run only the rails application under test.
Set Analytics to 'Disabled' for the deployed rails application on WebROaR. To ensure a fair comparison, we recommend turning off this feature to get the fastest performance for your application on WebROaR.
Ensure each server is initialized and warmed up properly for every test, i.e. it has loaded all its resources and, in the case of Passenger and WebROaR, has already instantiated its maximum number of workers. We used the following command to warm up each server stack before running its tests.
httperf --hog --server ABC --num-conns 200 --num-calls 10 --uri /wiki/list --rate 20
After each test, the server should be shut down, and the tmp folder, log files and sessions should be cleared. The commands that can be used are rake tmp:clear, rake db:sessions:clear and rake log:clear.
To get the memory usage numbers for each server, we used a simple technique that I first saw on the benchmarking blog of the good folks at Phusion Passenger. Essentially, just after the load test use the free -m command to see the free memory in the system, and then check it again after shutting down the server. The difference of these 2 numbers gives us the actual amount of memory the server stack was consuming. E.g. if free -m reported 300 MB free after the test, and 550 MB after the server was shut down, the total memory usage of the server was 550-300 = 250 MB.
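The accounting reduces to a single subtraction; a sketch with the hypothetical numbers from the example (in practice the two values come from reading free -m before and after shutdown):

```shell
# Memory-usage accounting from two `free -m` readings (example numbers).
FREE_DURING_TEST=300     # MB free just after the load test, server still up
FREE_AFTER_SHUTDOWN=550  # MB free after the server stack was shut down

echo "server stack used $((FREE_AFTER_SHUTDOWN - FREE_DURING_TEST)) MB"
```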
- For this particular test, WebROaR is able to handle the maximum demanded request rate among all the deployment stacks. Even after it gets saturated, its performance doesn't degrade and there are no errors.
- Passenger & Thin started giving errors, at which point the graphing software dropped their reply rates to zero. That is not statistically accurate, but it does emphasize the point in a way.
- Mongrel also behaves nicely after saturation and doesn't throw any errors.
- Please note that Thin is able to handle the load very well before it breaks down at around 178 RPS. Its green line is merged with WebROaR's till that point and is not visible in the image above.
The maximum RPS numbers for each of the deployment stacks are:
- Apache 2.2.9 + Passenger 2.2.7 (6) - 107.3
- Apache 2.2.9 + Mongrel 1.1.5 Cluster (6) - 123.8
- Apache 2.2.9 + Thin 1.2.5 Cluster (6) - 178.4
- WebROaR 0.2.3 (6 maximum workers) - 188.8
As per this test, on average WebROaR is
~76% faster than Passenger
~52% faster than Mongrel
~6% faster than Thin
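Those percentages follow directly from the maximum RPS numbers listed above; a quick cross-check:

```shell
# Recompute the relative speedups from the measured max RPS figures.
awk 'BEGIN {
    webroar = 188.8
    printf "vs Passenger: %.1f%%\n", (webroar / 107.3 - 1) * 100
    printf "vs Mongrel:   %.1f%%\n", (webroar / 123.8 - 1) * 100
    printf "vs Thin:      %.1f%%\n", (webroar / 178.4 - 1) * 100
}'
```

This prints roughly 76.0%, 52.5% and 5.8%, matching the rounded figures quoted above.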
Remember, this is just one particular action of one particular application. It is not wise to derive a conclusion for each and every application based on the above result.
Let's look into more of the details given to us by httperf.
The above table has the detailed numbers for the test run of each deployment stack where it performed at its best. The (avg RPS ± 2 * SD) range gives us a good idea of the performance of the server for 95% of its samples. WebROaR has the smallest standard deviation at 1.5, a good indicator of consistency in its performance across all requests.
Also, looking at the above table, we can safely say that for this test network I/O was not a bottleneck at all.
Memory usage numbers for each of the deployment stacks when delivering their maximum RPS are:
- Apache 2.2.9 + Passenger 2.2.7 (6) - 133 MB
- Apache 2.2.9 + Mongrel 1.1.5 Cluster (6) - 368 MB
- Apache 2.2.9 + Thin 1.2.5 Cluster (6) - 231 MB
- WebROaR 0.2.3 (6 maximum workers) - 258 MB
For this test, Passenger consumed the least amount of memory followed by Thin.
More Performance Testing
We picked 2 more open source Rails applications and tested them using the above procedure.
Name: Redmine
Rails Version: 2.3.4
Database: MySQL 5.0.67-0ubuntu6
Name: El Dorado
Rails Version: 2.3.3
Database: MySQL 5.0.67-0ubuntu6
Here are the results:
As per the tests, on average WebROaR is
15 to 36% faster for the tested Redmine URL
9 to 39% faster for the tested El Dorado URL
than other servers.
The above memory usage graph indicates that Passenger & WebROaR seem to be less memory hungry than the other 2 servers. (At least for these applications!)
More Performance Testing (with a different tool)
We also tested the 3 applications (Instiki, Redmine and El Dorado) with 'ab'. In our opinion, 'httperf' is more reliable and provides more useful information, but we thought it would be a good idea to cross-check whether 'ab' gives similar (if not exact) results.
The following command was used to 'warmup' the server stacks:
ab -c20 -n2000 [Application URL]
The actual test:
ab -c20 -n20000 [Application URL]
The tests with 'ab' also confirmed the trends seen with 'httperf': WebROaR does seem to be doing better than the other servers.
The memory usage numbers again corroborated the earlier finding of Passenger and WebROaR being less memory hungry than the other 2 servers.
The complete set of raw results data can be downloaded from here.
How much should one really read in to the numbers above? Does WebROaR beat all other deployment stacks every time by these margins?
Well, we would suggest taking these numbers as indicative of how your application might perform on WebROaR. You may or may not see these exact gains (they could be higher or lower), but you can safely assume that in most cases there would be a decent performance gain for your application if it runs on WebROaR.
Apart from performance, we believe WebROaR brings a whole lot of simplicity and an integrated solution for Ruby on Rails™ application deployment. Do check it out and run your application on it when you have some time!
We would be happy to receive your feedback/comments/suggestions on this article.