Untwisting the Event Loop


Have you ever wondered why your Rails application is so memory hungry while it never really manages to fully utilize your CPUs? To saturate your CPUs you have to run a large number of Thin (or Mongrel, or whatever) instances. Why is that? We all know that the Ruby interpreter cannot utilize more than one CPU (or no more than one CPU at a time in the case of 1.9). But why can't Ruby (or maybe it's Rails?) use even that one processor efficiently? Let's look for an answer to this question.

First off, what happens in a typical Rails action? The framework does some request mapping and routing, which is mostly CPU work (if we consider memory latency negligible). Then a few queries are sent to the database to retrieve some data, after which comes the rendering phase, which is mostly CPU work as well.
def show
  @user = User.find(params[:id]) # db access
  @events = Events.find(:all)    # another db access
  render :action => :show        # rendering
end

The problem comes with the database parts of the action. Calls to the database block processing until the results get back from the DBMS. During that time Rails is frozen, not even trying to do anything else until the call returns. The good news is that threads can help here (even Ruby's green threads): a blocked thread gives way to other threads until it is back in the ready state, filling those idle slots with useful work. Sounds good enough? NO!
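To make the idea concrete, here is a minimal sketch (plain Ruby, not Rails code) where sleep stands in for waiting on the database; while one thread is blocked the interpreter schedules the other, so the two waits overlap:
# A minimal sketch, not Rails code: `sleep` stands in for a blocking DB wait.
# While one thread is waiting, Ruby schedules the other, so the total wall
# time is roughly 1 second rather than 2.
t1 = Thread.new { sleep 1; 'user row'   }
t2 = Thread.new { sleep 1; 'event rows' }

start = Time.now
user, events = t1.value, t2.value
puts "fetched #{user.inspect} and #{events.inspect} in #{Time.now - start} seconds"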

Sadly, Rails is NOT thread safe. You cannot use threads to do parallel processing in Rails. So why not something like Merb? I hear you say. Well, Merb and threads will be able to interleave CPU operations and help with the time spent on IO when, say, fetching data from some other service. But they won't save you when you do database IO, for the simple reason that calling into a C extension blocks the whole Ruby interpreter. Yes, you read that correctly: nothing can be scheduled while a native call is in progress. Since database drivers are mostly C extensions, they all suffer from this. Your nice SELECT statement keeps the whole Ruby interpreter on hold until it finishes.
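For illustration, here is a hedged sketch of that behaviour, assuming Ruby 1.8 with the classic C-extension mysql gem and a local MySQL server (the connection details are made up): the ticker thread should print a dot every tenth of a second, but it falls silent for the full three seconds of the query because the native call never gives the scheduler a chance to run it.
# Hedged sketch: assumes the classic `mysql` C-extension gem and a local
# MySQL server with made-up credentials.
require 'mysql'
$stdout.sync = true

# A background thread that should print a dot every 0.1 seconds.
ticker = Thread.new { loop { print '.'; sleep 0.1 } }

conn = Mysql.real_connect('localhost', 'user', 'password', 'mydb')
conn.query('SELECT SLEEP(3)') # the whole interpreter stalls here: no dots appear
ticker.kill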

But there must be a solution to this. We cannot all be left high and dry with interpreters eating our memory while barely using our CPUs.

Enter EventMachine and AsyMy

For those who are not in the loop of events (pun intended), there happens to be another approach to this problem: event based (read: asynchronous) IO. In this mode of operation you request an IO operation and tell the event loop what to do when the request is fulfilled (either fully or partially). An excellent event handling library exists for Ruby: Francis' EventMachine (used internally by the Thin server and the evented flavour of Mongrel). Still, using EventMachine does not magically solve all our problems. The question that keeps popping up is: what do we do about database access? AsyMy to the rescue! AsyMy, written by Thomas Ptacek, is an evented driver for MySQL that operates in an asynchronous fashion. A quick example looks like this:
connection.execute('SELECT * from events') do |headers, data|
  # do something with headers and data
  pp headers
  pp data
end
AsyMy is still at a very early stage: the performance is horrible (it is based on the darn slow pure Ruby MySQL driver) and it comes with many rough edges (I was not able to run INSERTs and UPDATEs without hacking it, and I am still not able to run the callbacks for those). Nevertheless, this is a formidable achievement on the road to a very fast single threaded implementation.
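For context, and independently of AsyMy, this is roughly what raw EventMachine code looks like; a hedged sketch in which the EchoClient class and the target host are made up, showing how you register callbacks that the reactor invokes instead of blocking on the socket:
require 'eventmachine'

# Hypothetical handler class: the reactor calls these methods for us.
class EchoClient < EM::Connection
  def post_init
    send_data "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n"
  end

  def receive_data(data) # invoked whenever bytes arrive on the socket
    print data
  end

  def unbind             # invoked when the connection closes
    EM.stop
  end
end

EM.run { EM.connect('example.com', 80, EchoClient) }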

Here's what our action would look like if there were an AsyMy adapter for ActiveRecord:
# this is probably wrong but it can illustrate
# the twisted nature of evented programming
def show
  User.find(params[:id]) do |result_set|
    @user = result_set
    Events.find(:all) do |result_set|
      @events = result_set
      @events.each do |event|
        event.owner = @user
        if event != @events.last
          event.save
        else
          event.save do |ev|
            render :action => :show
          end
        end
      end
    end
  end
end
We had to twist the control flow to make use of the evented nature of the new driver. Instead of the flow passing through normally, it gets scattered across the different callbacks. This is one of the areas where event based programming makes you change the way you think about program flow; a hurdle for many developers and a show stopper for some. No wonder the event library for Python is called Twisted.

Why not untangle this with Fibers?

Fibers are lightweight concurrency primitives introduced in Ruby 1.9. How lightweight? Well, they don't come at zero cost, but in long running requests the weight they add is negligible. Fibers provide a form of cooperative (rather than preemptive) concurrency inside a single thread (you cannot pass fibers between threads, you have been warned). Fibers can pause and resume like continuations, but they don't suffer from the memory leaks that continuations have. If we use this feature wisely, we can unwind the action code above to look like this:
def show
  @user = User.find(params[:id])
  @events = Events.find(:all).each do |event|
    event.owner = @user
    event.save
  end
end
Huh? This is the normal action code we are used to. Well, with fibers we can keep writing it this way and still do things in an evented way under the hood.

To make things clear we need to illustrate Fibers with an example:
require 'fiber'

fiber = Fiber.new do
  # do something
  Fiber.yield another_thing
  # do yet another thing
end

yielded = fiber.resume # => runs the fiber till the yield,
                       #    returns the yielded value
                       #    and pauses the fiber where it is
fiber.resume #=> re-runs the fiber from the point where it was paused
fiber.resume #=> no more statements to run, raises an exception
Let's see how this can be useful for dispatching controller actions (this code would preferably live in the server itself):
Fiber.new do
  Dispatcher.dispatch(controller, action, req, res)
  send_response res
end.resume
Inside the action we call the find method repeatedly. This method could be implemented like this:
class DataStore
  def find(*args)
    query = construct_query(*args)
    fiber = Fiber.current # grab the fiber this request is running in
    conn.execute(query) do |headers, data|
      # the driver's callback: hand the result back and wake the fiber up
      fiber.resume convert_to_objects(data)
    end
    Fiber.yield # pause here; the value passed to resume becomes our return value
  end
end
This way, whenever the code hits a find call it hands the query to the db driver, returns immediately and pauses its fiber, giving room for other requests to be processed. Once the data comes back from the db server the callback runs and resumes the fiber, passing it the result of the query. The result is then returned to the caller of find, and the original action method continues to completion (or until it is paused again by another find).
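Here is a self-contained sketch of the same pattern without a database, assuming only the eventmachine gem and Ruby 1.9; the async_sleep method and its timer are stand-ins for an asynchronous driver call and its completion callback:
require 'eventmachine'
require 'fiber'

# async_sleep is a stand-in for an asynchronous driver call: it registers a
# callback with the reactor, pauses the current fiber, and returns whatever
# the callback later passes to resume.
def async_sleep(seconds)
  fiber = Fiber.current
  EM.add_timer(seconds) { fiber.resume("woke up after #{seconds}s") }
  Fiber.yield
end

EM.run do
  Fiber.new do
    puts 'before the call'
    result = async_sleep(1) # reads like blocking code, runs evented
    puts result
    EM.stop
  end.resume
end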

Roger Pack has a nice writeup (with actual working code) on the Evented Fibered combo here.

Charles Jolley implemented a similar thing here. It is called Pipelined, and while it is more obtrusive than the approach described above, it has the advantage of being optional. Pipelined uses continuations and hence works on Ruby 1.8 (and with Rails).

I am still ironing things out and tying them together (and doing lots of benchmarks), and I should mention that I have ditched AsyMy for now in favour of another alternative, which I will discuss in detail in another blog post.
