Labels: activerecord , neverblock , postgres , rails , ruby , ruby 1.9
They told you it can't be done, they told you it doesn't scale. They told you lies!
What if you could suddenly serve multiple concurrent requests from a single Rails instance? What if you could multiplex IO operations from a single Rails instance?
No more what ifs. It has been done.
I was testing NeverBlock support for Rails. For the test I built a normal Rails application. Nothing unusual here: you get the whole usual Rails deal, routes, controllers, ActiveRecord models and eRuby templates. I am using the Thin server to serve the application and PostgreSQL as the database server. The only difference is that instead of the PostgreSQL adapter I was using the NeverBlock::PostgreSQL adapter.
All I needed to do was set the adapter in database.yml to neverblock_postgresql instead of postgresql and require 'never_block/server/thin' in my production.rb.
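Concretely, the switch looks something like this (a sketch; the database name and host are placeholders, only the adapter line is from the post):

```yaml
# config/database.yml -- only the adapter line changes
production:
  adapter: neverblock_postgresql   # was: postgresql
  database: myapp_production       # placeholder
  host: localhost                  # placeholder
```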
All this was running on Ruby 1.9, so I had to comment out the body of the load_rubygems method in config/boot.rb, which is not needed in Ruby 1.9 anyway.
Now what difference does this thing make?
It allows you to process multiple requests concurrently from a single Rails instance. It does this by utilizing the async features of the PG client interface, coupled with Fibers and EventMachine, to provide transparent async operations.
So, when a Rails action issues any ActiveRecord operation, it is suspended and another Rails action can kick in. The first one is resumed once PostgreSQL has delivered the data.
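That suspend/resume dance is plain Fiber mechanics. Here is a minimal, NeverBlock-free sketch of the idea (toy code, not the actual adapter):

```ruby
# Each "request" runs in a Fiber that yields where it would block on
# the database; a toy reactor resumes it when the data is "ready".
log = []

requests = [1, 2].map do |id|
  Fiber.new do
    log << "request #{id}: query sent"
    Fiber.yield                    # suspended while PostgreSQL works
    log << "request #{id}: rows received"
  end
end

requests.each(&:resume)   # both requests get their queries in flight
requests.each(&:resume)   # both are resumed as results arrive
```

Both queries end up "in flight" before either result is consumed, which is exactly why the sleeping requests in the benchmark below overlap instead of queueing.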
To make a quick test, I created a controller which would use an AR model to issue the SQL command "select sleep(1)". (A sleep function does not come by default with PostgreSQL; you have to implement it yourself.) I ran the application with the normal postgresql adapter and used ApacheBench to measure the performance of 10 concurrent requests.
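For reference, one way to provide such a sleep function is to wrap PostgreSQL's built-in pg_sleep (available since 8.2). This is my sketch, not necessarily the definition used in the post:

```sql
-- Hypothetical wrapper so that "select sleep(1)" works:
CREATE OR REPLACE FUNCTION sleep(seconds integer) RETURNS void AS $$
  SELECT pg_sleep($1)
$$ LANGUAGE sql;
```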
Here are the results:
Server Software: thin
Server Hostname: localhost
Server Port: 3000
Document Path: /forums/sleep/
Document Length: 11 bytes
Concurrency Level: 10
Time taken for tests: 10.248252 seconds
Complete requests: 10
Failed requests: 0
Write errors: 0
Total transferred: 4680 bytes
HTML transferred: 110 bytes
Requests per second: 0.98 [#/sec] (mean)
Time per request: 10248.252 [ms] (mean)
Time per request: 1024.825 [ms] (mean, across all concurrent requests)
Transfer rate: 0.39 [Kbytes/sec] received
Almost 1 request per second, which is what I expected. Then I switched to the new adapter, restarted Thin and redid the test.
Here are the new results:
Server Software: thin
Server Hostname: localhost
Server Port: 3000
Document Path: /forums/sleep/
Document Length: 11 bytes
Concurrency Level: 10
Time taken for tests: 1.75797 seconds
Complete requests: 10
Failed requests: 0
Write errors: 0
Total transferred: 4680 bytes
HTML transferred: 110 bytes
Requests per second: 9.30 [#/sec] (mean)
Time per request: 1075.797 [ms] (mean)
Time per request: 107.580 [ms] (mean, across all concurrent requests)
Transfer rate: 3.72 [Kbytes/sec] received
Wow! A 9x speed improvement! The database requests were able to run concurrently, and they all came back together.
I decided to simulate various workloads and test the new implementation against the old one. I devised the workloads taking into account that the test machine had rather poor IO performance, so I used queries that would not tax the IO but would still require PostgreSQL to take its time. The workloads were categorized as follows:
First, a request would issue a "select 1" query (the fastest I could think of); then, for the different workloads:
1 - Very light work load, every 200 requests, one "select sleep(1)" would be issued
2 - Light work load, every 100 requests, one "select sleep(1)" would be issued
3 - Moderate work load, every 50 requests, one "select sleep(1)" would be issued
4 - Heavy work load, every 20 requests, one "select sleep(1)" would be issued
5 - Very heavy work load, every 10 requests, one "select sleep(1)" would be issued
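The mixing rule above can be sketched as follows (my reconstruction, not the author's benchmark code):

```ruby
# Every Nth request issues the expensive "select sleep(1)";
# all other requests issue the cheap "select 1".
SLEEP_EVERY = {
  very_light: 200, light: 100, moderate: 50, heavy: 20, very_heavy: 10
}.freeze

def query_for(request_number, workload)
  n = SLEEP_EVERY.fetch(workload)
  (request_number % n).zero? ? "select sleep(1)" : "select 1"
end
```

For example, under the very heavy workload, request 10 issues "select sleep(1)" while request 11 issues "select 1".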
I tested those workloads against the following setups:
1 - 1 Thin server, normal PostgreSQL adapter
2 - 2 Thin servers (behind nginx), normal PostgreSQL adapter
3 - 4 Thin servers (behind nginx), normal PostgreSQL adapter
4 - 1 Thin server, NeverBlock PostgreSQL adapter
I tested with 1000 queries and a concurrency of 200 (the multiple Thin servers were having problems above that figure; the new adapter scaled up to 1000 with no problems, usually with similar or slightly better results).
Here are the graphed results:
For the NeverBlock Thin server I was using a pool of 12 connections. As you can see from the results, in the very heavy workload it performs on par with a 12-Thin cluster. Generally the NeverBlock Thin server easily outperforms the 4-Thin cluster, and the margin increases as the workload gets heavier.
And here are the results for scaling the number of concurrent connections for a NeverBlock::Thin server:
Traditionally we used to spawn as many Thin servers as we could until we ran out of memory. Now we don't need to, as a single process maintains multiple connections and can saturate a single CPU core; hence the perfect setup seems to be a single server instance per processor core.
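On, say, a dual-core box that would mean just two NeverBlock::Thin processes behind the front end. A hypothetical nginx upstream (ports made up):

```nginx
upstream neverblock_thin {
    # one NeverBlock::Thin instance per CPU core
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
}
```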
But to really saturate a CPU, one has to do all IO requests in a non-blocking manner, not just the database ones. This is exactly the next step once the DB implementation is stable: to enrich NeverBlock with a set of IO libraries that operate in a seemingly blocking way while doing all their IO in a totally transparent, non-blocking manner, thanks to Fibers.
I am now wondering about the possibilities: the reduced memory footprint and the benefits such a solution could bring to the likes of DreamHost and the other Rails hosting companies.
Great post, thanks for the info. Thanks to you, I'll be looking into this further.
I'm convinced this whole NeverBlock thing is a front to allow you to show off your pretty graphing skills.
Good work.
Thanks for your research. Much appreciated.
Very cool.. can anyone shed light on whether this will ever work with MySQL? Thank you.
MySQL is already supported, more info on that topic should be available tomorrow :)
How were you able to have multiple concurrent Rails requests--did you use config.threadsafe! from edge Rails?
If so then I guess things will only get better with 2.2 :) ?
A workload of "sleep (1)" is somewhat light--I'd imagine we'd see somewhat less of a speedup for normal rails apps :D
Nice work!
-=R
@Roger, I did not use the threadsafe configuration option. Fibers prevent things like race conditions. You only need to make sure no transient information is stored in static/global variables.
I had to remove the synchronization blocks to get this to work.
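The caveat about transient state can be seen in a small sketch: Thread#[] storage is fiber-local in Ruby 1.9+, while a global is shared by every interleaved request (toy code, not from NeverBlock):

```ruby
# Two "requests" interleave; the global leaks across them,
# the fiber-local value does not.
$current_user = nil                 # shared across all fibers: unsafe

a = Fiber.new do
  Thread.current[:user] = "alice"   # fiber-local in Ruby 1.9+
  $current_user = "alice"
  Fiber.yield                       # another request runs meanwhile
  Thread.current[:user]             # what did this request keep?
end

b = Fiber.new do
  Thread.current[:user] = "bob"
  $current_user = "bob"
end

a.resume                  # request A runs until its "query"
b.resume                  # request B runs to completion in between
local  = a.resume         # the fiber-local value survived: "alice"
global = $current_user    # the global was clobbered by B: "bob"
```

This is why the reply above only asks you to keep transient information out of static/global variables; per-request state kept in fiber-local storage stays isolated.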
Hello mhd, this is Khaled from d1g.com.
Can I have your email somehow? This is mine:
khellls@gmail.com
Really nice work! Looking forward to 1.9 becoming stable so I could comfortably use this in production.
@aquasync, you can use this with 1.8 now; it's still not release quality, but you can test it.
Looks like you missed a 0 in your time for all requests in the Neverblock test. It says 1.75 instead of 1.075, which puts a damper on your calculations.
He may have been referring to req/s
@Jesse, exactly like Roger said, I was referring to req/sec figures.
You write very well.
Dear Muhammed,
I really like the graphs, especially http://www.espace.com.eg/assets/neverblock/images/charts/1.gif and
http://www.espace.com.eg/assets/neverblock/images/charts/2.gif.
What library did you use to generate these? I'd like to use them in my thesis as well.
Kind regards,
Michel