Performance Tips for Ruby on Rails Applications
The performance of a Rails application is influenced by many factors, particularly the configuration of your deployment server(s). However, the application code can make a big difference and determine whether your site is slow or highly responsive. Achieving good performance is a tricky business.
Below are some tips on coding around the most common problem areas.
Choosing a Session Container
Rails comes with several built-in session containers. All applications I have analyzed used either PStore, which stores session information in a separate file on your file system, or ActiveRecordStore, which stores it in the database. Both choices are less than ideal, and both slow down action-cached pages in particular. Two much better alternatives are available: SQLSessionStore and MemCacheStore.
SQLSessionStore avoids the overhead associated with ActiveRecordStore by
- not using transactions (they are not required for correct operation of SQLSessionStore)
- offloading the work of updating “created_at” and “updated_at” to the database
If you use MySQL, make sure to use a MyISAM table for sessions. It is faster than InnoDB, and transactions are not required.
MemCacheStore is even faster than SQLSessionStore. Measurements show a 30% speed improvement for action-cached pages. You need to install Eric Hodel’s memcache client library and do some configuration in environment.rb to be able to use it. Warning: do not attempt to use Ruby-Memcache (it’s really, really slow).
For my own projects I tend to use database-based session storage, as it enables simple administration through either the Rails command line or administrative tools provided by the database package. You’d need to write your own scripts for MemCacheStore. On the other hand, memcached presumably scales better for very high traffic web sites and comes with Rails-supported automated session expiry.
Caching Computations During Request Processing
If you need the same data over and over again while processing a single request, and you can’t use class-level caching because the data depends in some way on the request parameters, cache the data in an instance variable to avoid repeated calculations.
The pattern is easily employed:
module M
  def get_data_for(request)
    @cached_data_for_request ||=
      begin
        # expensive computation depending on request, returning data
      end
  end
end
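The expensive computation could be as simple as ("A".."Z").to_a, or it could be a database query, retrieving a specific user, for example. Note that ||= repeats the computation whenever the cached result is nil or false.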
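As a minimal sketch of the same idea, the variant below keys the cache by argument, so different inputs within one request are cached separately. RequestCache, FakeController and expensive_lookup are hypothetical names made up for this illustration, and the controller is only a plain Ruby stand-in for an object that lives for one request.

```ruby
# Hypothetical helper module; the names are illustrative, not part of Rails.
module RequestCache
  def data_for(key)
    @request_cache ||= {}
    # fetch with a block also caches nil/false results,
    # which a plain ||= would recompute on every call.
    @request_cache.fetch(key) { @request_cache[key] = expensive_lookup(key) }
  end
end

# Stand-in for a controller: one instance lives for one request.
class FakeController
  include RequestCache
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  private

  def expensive_lookup(key)
    @lookups += 1        # count how often the real work runs
    ("A"..key).to_a      # placeholder for a query or heavy computation
  end
end

controller = FakeController.new
3.times { controller.data_for("E") }
controller.data_for("C")
puts controller.lookups  # two distinct keys, so the work ran twice: prints 2
```

Because the cache lives in an instance variable, it disappears with the object at the end of the request, so no explicit invalidation is needed.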
Optimizing Finder Queries
Rails comes with a powerful domain-specific language for defining associations between model classes which reflect table relationships. Alas, the current implementation hasn’t been optimized for performance yet. Relying on the built-in generated accessors can severely hurt performance.
The first part of the problem is usually described as the “1+N” query problem: if you load N objects from class Post (table “posts”), which has an n-1 relationship to class Author (table “authors”), accessing the author of each post using the generated accessor methods will cause N additional queries to the database. This, of course, puts some additional load on the database, but more importantly for Rails application server performance, the SQL query statement to be issued will be reconstructed for each object accessed.
You can get around this overhead by adding an :include => :author to your query parameters like so:
Post.find(:all, :conditions => ..., :include => :author)
This will avoid all of the above mentioned overhead by issuing a single SQL statement and constructing the author objects immediately. This technique is commonly called “find with eager associations” and can also be used with other relationship types (such as 1-1, 1-n or n-m).
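To make the query counts concrete, here is a small plain-Ruby simulation. It does not use ActiveRecord: FakeDb is a made-up in-memory stand-in that only counts how many "queries" it serves, so the numbers illustrate the 1+N pattern rather than measure Rails itself.

```ruby
# In-memory stand-in for a database that counts the "queries" it serves.
class FakeDb
  attr_reader :query_count

  def initialize
    @query_count = 0
    @authors = { 1 => "Ann", 2 => "Bob" }
    @posts = [
      { :id => 1, :title => "First",  :author_id => 1 },
      { :id => 2, :title => "Second", :author_id => 2 },
      { :id => 3, :title => "Third",  :author_id => 1 }
    ]
  end

  # One query returning all posts, without author data.
  def all_posts
    @query_count += 1
    @posts.map(&:dup)
  end

  # One query per author lookup, as with a lazily loaded association.
  def author_name(id)
    @query_count += 1
    @authors[id]
  end

  # One round trip returning posts with the author name attached,
  # analogous to a single query that fetches both tables.
  def posts_with_authors
    @query_count += 1
    @posts.map { |post| post.merge(:author_name => @authors[post[:author_id]]) }
  end
end

lazy = FakeDb.new
lazy.all_posts.each { |post| lazy.author_name(post[:author_id]) }
puts lazy.query_count    # 1 query for the posts + 3 for the authors: prints 4

eager = FakeDb.new
eager.posts_with_authors
puts eager.query_count   # prints 1
```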
However, n-1 relationships can be optimized further by using a technique called “piggy backing”: ActiveRecord objects loaded through a join carry the attributes from the joined table(s) along with the attributes from the original table. Thus, a single query with a join can be used to fetch all required information from the database. You could replace the query above with
Post.find(:all, :conditions => ...,
  :joins => "LEFT JOIN authors ON posts.author_id=authors.id",
  :select => "posts.*, authors.name AS author_name")
assuming that your view will only display the author’s name attached to the article information. If, in addition, your view only displays a subset of the available article columns, say “title”, “author_id” and “created_at”, you should modify the above to
Post.find(:all, :conditions => ...,
  :joins => "LEFT JOIN authors ON posts.author_id=authors.id",
  :select => "posts.id, posts.title, posts.created_at, posts.author_id, authors.name AS author_name")
In general, loading only partial objects can be used to speed up queries quite a bit, especially if you have a large number of columns on your model objects. In order to get the full speedup from the technique, you also need to define a method on the model class to access any attributes piggy backed on the query:
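class Post
  ...
  # Use the piggy-backed column if the query selected it,
  # otherwise fall back to loading the association.
  def author_name
    @attributes['author_name'] ||= author.name
  end
end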
Using this pattern relieves you from knowing whether the original query has a join or not, when writing your view code.
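The fallback behaviour of that accessor can be sketched in plain Ruby. Nothing here is ActiveRecord: the Post and Author classes below are stand-ins written for this example, and the lambda passed to the constructor is an assumed placeholder for the extra query an association would trigger.

```ruby
# Plain-Ruby stand-ins; these are not ActiveRecord classes.
Author = Struct.new(:name)

class Post
  attr_reader :author_loads

  def initialize(attributes, author_source)
    @attributes = attributes        # column name => value, as returned by the query
    @author_source = author_source  # callable standing in for the extra query
    @author_loads = 0
  end

  # Lazily "loads" the author and counts how often that happens.
  def author
    @author ||= begin
      @author_loads += 1
      @author_source.call
    end
  end

  # Use the piggy-backed column if present, else fall back to the association.
  def author_name
    @attributes['author_name'] ||= author.name
  end
end

load_ann = lambda { Author.new("Ann") }

joined = Post.new({ 'title' => "First", 'author_name' => "Ann" }, load_ann)
plain  = Post.new({ 'title' => "First" }, load_ann)

puts joined.author_name   # prints Ann
puts joined.author_loads  # prints 0, the joined column was used
puts plain.author_name    # prints Ann
puts plain.author_loads   # prints 1, the association had to be loaded
```

The view calls author_name in both cases and never needs to know which query produced the object.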
If your database supports views, you could define a view containing just the required information and you would get around writing complicated queries manually. This would also get you the correct data conversion for fields retrieved from the join table. As of now, you don’t get these from Rails, but need to code them manually.
- Joins are a great tool, and Rails uses a LEFT JOIN in the query when :include is used. The code is short, but the MySQL query time can be very high, especially if the key fields are not indexed.
- Retrieve only the information that you need. A lot of execution time can be wasted by running selects for data that is not really needed. When using the various finders, make sure to provide the right options to select only the fields required (:select), and if you only need a numbered subset of records from the result set, specify a limit (with the :limit and :offset options).
- Avoid dynamic finders like MyModel.find_by_*. While something like User.find_by_username is very readable and easy to write, it can also cost you a lot: ActiveRecord generates these methods dynamically within method_missing, which can be quite slow. Once the method is defined and invoked, the mapping to the model attribute (username in our example) is ultimately achieved through a select query that has to be built before it is sent to the database. Using MyModel.find_by_sql directly, or even MyModel.find, is much more efficient.
- Be sure to use MyModel.find_by_sql whenever you need to run an optimized SQL query. Even if the final SQL statement ends up being the same, find_by_sql is more efficient than the equivalent find, because there is no need to build the SQL string from the various options passed to the method. If you are building a plugin that needs to be cross-platform, though, verify that the SQL queries will run on all Rails-supported databases, or just use find instead. In general, using find is more readable and leads to more maintainable code, so before filling your application with find_by_sql, do some profiling and identify the slow queries that need to be customized and optimized manually.
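The method_missing overhead mentioned in the list above can be illustrated with a generic Ruby benchmark. This is only a sketch: UserTable and its finders are made up for the example and contain no ActiveRecord code, so the timings show the cost of dispatching through method_missing in general, not the cost of Rails' dynamic finders specifically.

```ruby
require 'benchmark'

# Toy model; UserTable and its finders are illustrative, not ActiveRecord.
class UserTable
  ROWS = [{ :username => "ann" }, { :username => "bob" }]

  # Explicit finder: resolved by normal method lookup.
  def self.find_by_column(column, value)
    ROWS.find { |row| row[column] == value }
  end

  # Dynamic finder: every call falls through to method_missing,
  # parses the method name, then delegates to the explicit finder.
  def self.method_missing(name, *args)
    if name.to_s =~ /\Afind_by_(\w+)\z/
      find_by_column($1.to_sym, args.first)
    else
      super
    end
  end

  def self.respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("find_by_") || super
  end
end

n = 100_000
Benchmark.bm(10) do |bm|
  bm.report("explicit:") { n.times { UserTable.find_by_column(:username, "bob") } }
  bm.report("dynamic:")  { n.times { UserTable.find_by_username("bob") } }
end
```

Both calls return the same row; only the dispatch path differs. As always, measure in your own application before rewriting readable code.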
Use HTML for your views
A number of helpers in Rails core will run rather slowly. In general, all helpers that take a URL hash will invoke the routing module to generate the shortest URL referencing the underlying controller action. This implies that several routes in the route file need to be examined, which is a costly process, most of the time. Even with a route file as simple as
ActionController::Routing::Routes.draw do |map|
  map.connect '', :controller => "welcome"
  map.connect ':controller/service.wsdl', :action => 'wsdl'
  map.connect ':controller/:action/:id'
end
you will see a big performance difference between writing
link_to "Look here for job #{h @job.title}",
  { :controller => "jobs", :action => "show", :id => @job },
  { :class => "job_link" }
and coding out the tiny piece of HTML directly:
<a href="/jobs/show/<%= @job.id %>"
   class="job_link">Look here for job <%= h @job.title %></a>
For pages displaying a large number of links, I have measured speed improvements up to 150% (given everything else has been optimized).
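Outside of Rails, the hand-coded variant can be tried with Ruby's standard ERB library. This is a sketch under assumptions: Job is a hypothetical Struct standing in for the @job model, job_link is a made-up helper name, and ERB::Util.html_escape plays the role of the h helper.

```ruby
require 'erb'

Job = Struct.new(:id, :title)  # hypothetical stand-in for the @job model

# Hand-coded link in an ERB template. The /jobs/show/:id path is hard-wired
# and must be kept in sync with the routes file manually.
TEMPLATE = ERB.new(<<-HTML)
<a href="/jobs/show/<%= job.id %>" class="job_link">Look here for job <%= ERB::Util.html_escape(job.title) %></a>
HTML

# Renders the template with the given job available as a local variable.
def job_link(job)
  TEMPLATE.result(binding).strip
end

puts job_link(Job.new(42, "Ruby & Rails developer"))
```

The trade-off is that the URL no longer follows route changes automatically, so this is best reserved for pages where profiling shows link generation to be a real cost.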
Rotate your logs
If your app is hit often, take the time to clean old logs out of your “log” directory. Rails ships with a rake task that does it for you (newer Rails versions name this task log:clear):
rake clear_logs
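As an alternative to clearing logs by hand, Ruby's standard Logger can rotate files by size. This is not something the rake task does; it is a separate technique, sketched here with deliberately tiny limits so the effect is visible. The helper name write_rotating_log, the file name and the numbers are assumptions for the demo; in a Rails app a logger built this way would typically be assigned to config.logger.

```ruby
require 'logger'
require 'tmpdir'

# Writes `lines` log entries into dir and returns the resulting file names.
# The helper name and the limits are illustrative assumptions.
def write_rotating_log(dir, lines)
  path = File.join(dir, "production.log")

  # Keep at most 3 files, starting a new one once the current file
  # grows past roughly 1 KB.
  logger = Logger.new(path, 3, 1024)

  lines.times { |i| logger.info("request #{i} processed") }
  logger.close

  Dir.children(dir).sort
end

Dir.mktmpdir do |dir|
  # Lists the current log plus the rotated copies that were kept.
  puts write_rotating_log(dir, 200)
end
```

With realistic settings you would pick a much larger size limit, or pass 'daily' or 'weekly' as the second argument to rotate by age instead.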
Patch the GC
Patching Ruby’s Garbage Collection is strongly advised and will improve the speed of your Ruby and Rails applications significantly.

September 8th, 2008 at 1:44 pm
Tweaking the database lets you make huge gains in performance without modifying your complex application.
September 8th, 2008 at 1:50 pm
Passing a block to a method of an ActiveRecord::Associations::HasManyAssociation instance and its friends chews up memory.
For example, a single call to association.select { |record| record.new_record? } can allocate up to 10K of memory depending on the association size.
September 9th, 2008 at 9:32 am
Thanks for the post!
In regards to point #2, there are some built-in helpers for caching in Rails 2.1 which are super helpful: http://railscasts.com/episodes/115-caching-in-rails-2-1
September 10th, 2008 at 11:27 pm
Nothing substitutes for benchmarks. Things that ought to be fast are sometimes slow. Things that ought to be slow are sometimes fast.
But don’t trust the benchmarks. Sometimes I’d return to my machine to find the *same query* running in less than a second instead of over a minute. I was using SQL_NO_CACHE, so I think background optimisation is responsible here. Since the database was very large, it’s possible that OSX’s virtual memory management was also interfering.