
The case for a nonblocking Ruby stack



In a previous post I talked about the problems that plague web-based Ruby applications regarding processor and memory use, and I proposed non-blocking IO as a solution. In a follow-up post I benchmarked nonblocking vs. blocking performance using the async facilities of the Ruby Postgres driver in combination with Ruby fibers. The results were promising enough (up to 40% improvement) that I decided to take the benchmarking effort one step further. I monkey patched the Ruby Postgres driver to be fiber aware and was able to integrate it into Sequel with little to no effort. Next I used the unicycle monorail server (the EventMachine HTTP server) inside an EventMachine loop. I created a dumb controller which would query the db and render the results using the Object#to_json method.
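The patch itself isn't included in this post, but the core idea is easy to sketch. Here is a minimal, illustrative version (the class and method names are mine, not the actual patch, and it uses the modern pg gem API) of a fiber-aware query layered over the driver's async calls inside an EventMachine reactor:

```ruby
require 'eventmachine'
require 'fiber'
require 'pg'

# Resumes the parked fiber whenever the database socket becomes readable.
module FiberWatcher
  def initialize(fiber)
    @fiber = fiber
  end

  def notify_readable
    @fiber.resume
  end
end

class FiberedPostgres
  def initialize(conninfo)
    @conn = PG.connect(conninfo)
  end

  # Send the query without blocking the reactor, park the calling fiber,
  # and let EventMachine resume it once the whole result has arrived.
  def query(sql)
    fiber = Fiber.current
    @conn.send_query(sql)
    watcher = EM.watch(@conn.socket_io, FiberWatcher, fiber) # older drivers expose only a raw fd via #socket
    watcher.notify_readable = true
    loop do
      Fiber.yield               # other request fibers keep running here
      @conn.consume_input
      break unless @conn.is_busy
    end
    watcher.detach
    last = nil
    while (res = @conn.get_result)
      last = res                # drain every result, keep the last one
    end
    last
  end
end
```

Each request then runs inside its own fiber: a slow query only parks that fiber at Fiber.yield while the reactor keeps serving everything else.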

As in the evented db access benchmark, a long query ran every n short queries (with n in {5, 10, 20, 50, 100}). The running application accepted two URLs: one ran db operations in normal (blocking) mode and the other in nonblocking mode, where every action invocation was wrapped in a fiber.
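Roughly, and with invented table and connection names for illustration, the dispatch for the two URLs amounted to something like this:

```ruby
require 'sequel'
require 'json'
require 'fiber'

# Hypothetical connection string; the real benchmark used the fiber-aware driver sketched above.
DB = Sequel.connect('postgres://localhost/benchmark_db')

# '/blocking' runs the query inline; '/nonblocking' wraps the same action in a
# fiber so the reactor is never blocked while waiting on the database.
def handle_request(path)
  action = lambda { DB[:items].limit(10).all.to_json } # the "dumb" controller body
  if path == '/nonblocking'
    Fiber.new { action.call }.resume # the response is written from inside the fiber once the query returns
  else
    action.call
  end
end
```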

Here are the benchmark results

Full results

Comparing the number of requests per second fulfilled by each combination of blocking mode (blocking or nonblocking) and long-to-short query ratio (one long query per 5, 10, 20, 50, or 100 short queries), at each concurrency level.



Advantage Graph

Comparing the advantage gained by nonblocking mode over blocking mode for the different long-to-short query ratios, with results shown for each concurrency level.



And the full results in tabular form

Ratio      Mode          Concurrency 10   Concurrency 100   Concurrency 1000

1 to 100   Nonblocking           456.94            608.67             631.82
1 to 100   Blocking              384.82            524.39             532.26
           Advantage             18.74%            16.07%             18.71%

1 to 50    Nonblocking           377.38            460.74             471.89
1 to 50    Blocking              266.63            337.49             339.01
           Advantage             41.54%            36.52%             39.20%

1 to 20    Nonblocking           220.44            238.63             266.07
1 to 20    Blocking              142.60            159.70             141.92
           Advantage             54.59%            49.42%             87.48%

1 to 10    Nonblocking           130.87            139.76             195.02
1 to 10    Blocking               78.68             84.84              81.07
           Advantage             66.33%            64.73%            140.56%

1 to 5     Nonblocking            70.05             75.50             109.34
1 to 5     Blocking               41.48             42.13              41.77
           Advantage              68.88%            79.21%            161.77%

(Throughput figures are requests per second; ratios are long to short queries.)

Conclusion

In accordance with my expectations, the nonblocking mode outperforms the blocking mode as long as enough long queries come into play. If all the db queries are very small then the blocking mode will win, mainly due to the overhead of fibers. Nevertheless, once there is even a single long query for every 100 short ones, the balance swings in favor of the nonblocking mode. There are still a few optimizations to be done, mainly completing the integration with EventMachine, which should theoretically enhance performance further. The next step is to integrate this into some framework and build a real application using this approach. Since Sequel is working now, having Ramaze or Merb running in non-blocking mode should be a fairly easy task. Sadly, Rails is out of the picture for now as it does not support Ruby 1.9 yet.

I reckon an application that does all its IO in an evented way will need far fewer processes per CPU core to make full use of it. Actually, I am guessing that a single core can be maxed out by a single process. If that is the case then I will be much happier replacing the 16 Thin processes running on my server with only 4. Couple that with the 30% memory savings we get from using Ruby Enterprise Edition and we are talking about an amazing 82.5% memory footprint reduction (4 processes at 70% of the original size each is 17.5% of the footprint of the original 16) without sacrificing performance.

Ruby Fibers Vs Ruby Threads



Ruby 1.9 fibers are touted as lightweight concurrency primitives that are much lighter than threads. I noticed a sizable impact when benchmarking an application that made heavy use of fibers, so I wondered: what if I switched to threads instead? After some time fighting with threads I decided I needed to write something specific for this comparison. I wrote a small application that spawns a number of fibers (or threads) and then returns the time that operation took. I also recorded the VM size after the operation (all created fibers and threads remain reachable, hence no garbage collection). I did not measure the cost of context switching for the two approaches; maybe another time.
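I won't reproduce the exact script, but a rough reconstruction of the idea (reading the RSS via ps is my assumption about how the VM size was sampled) looks like this:

```ruby
require 'fiber'

# Spawn `count` fibers or threads, keep them reachable so nothing is
# garbage collected, and report the creation time plus the process RSS.
def measure(kind, count)
  units = []
  start = Time.now
  count.times do
    units << case kind
             when :fiber  then Fiber.new { Fiber.yield }
             when :thread then Thread.new { sleep }
             end
  end
  elapsed = Time.now - start
  rss_mb  = `ps -o rss= -p #{Process.pid}`.to_i / 1024.0 # ps reports RSS in KB
  puts format('%d %ss: %.2fs, %.1f MB', count, kind, elapsed, rss_mb)
  units # return the array so the caller keeps the references alive
end

fibers  = measure(:fiber,  10_000)
threads = measure(:thread,  1_000) # threads hit an OS/VM limit far earlier
```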

Here are the results for creation time:



And the results for memory usage:



Conclusion

Fibers are much faster to create than threads, and they eat much less memory too. There is also a limit on the number of threads in 1.9: I maxed out at 3070 threads, while fibers were not complaining when I created 100,000 of them (though that took 203 seconds and occupied a whopping 500MB of RAM).

How much horsepower can your app engine deliver?



I was wondering whether the Google App Engine thing has the horsepower to actually deliver the scalability experience that Google is promising. I would expect App Engine to be able to handle high loads and fulfill requests at decent rates.

I decided to benchmark App Engine and see if it can live up to the promises Google is making.

For the benchmark I used a simple web page that hits the Google datastore, retrieves some data (a small data set, so the test wouldn't be bandwidth limited) and then converts it to JSON (not using any library code for that, just ugly hand-written string manipulation).

The web page (actually the currently super useless supergtd app) was written in accordance with the sample provided in the Google App Engine documentation. My Python skills are still immature, but I can tell that everything is pretty straightforward: the route is determined, the correct Python file is loaded and my get method is run. The method calls the datastore, retrieves the data, does some string processing and sends the result back.

I tested from a server at SoftLayer (the one with the most bandwidth and the least latency of all the servers I have access to, and a dual quad-core monster, for the record).

Here are the results from the benchmark:

Frankly, I am very satisfied with the results. The app engine was capable of handling any load I threw at it. Even though the request I am testing is kinda simplistic, it manages to pass through enough different parts of the stack to be relevant. Many APIs will have resources that are only slightly more complex than the sample resource I created for this benchmark.

Bottom line: top-notch performance. And you don't even need to bother with an Elastic Compute Cloud configuration. You don't manage instances or anything (or even pay by the instance); you only pay per usage, which is a much sweeter deal if you ask me (provided your app does not need to cross the sandbox boundaries, that is). Oh, and before I forget, serving static content was pretty fast too. Not breathtakingly fast, but very decent to say the least (you should never skip proper client caching though, if only to save bandwidth).

If anyone at Google is reading this, I say "Great job, big hand for the big G". Now show me some Ruby love (and JavaScript too, for what it's worth).

Did I mention that I need some Ruby love? Please.

Thin is getting thinner by the day



It seems that macournoyer is on a frenzy of Thin releases! He just released the 0.6.1 "sweet cheesecake" release (what's up with the release names? I'm trying to control my diet here!). The new release uses less memory than Mongrel, which is a very good thing IMHO. It also uses the same config files as Mongrel Cluster. The big news is the ability to use UNIX sockets to connect to the load balancer instead of TCP connections, which decreases the overhead of nginx sending requests to and getting responses back from the upstream servers.

This is good news, though I will have to test whether it translates into a noticeable performance increase in a full-fledged Ruby on Rails application.

I will redo my tests with this new setup and come back with more numbers.

Stay tuned