Pages

Thursday, December 26, 2013

Counter Cache in Rails - Performance Tuning

If you need to display the record count for a has_many association, you can improve performance by caching that number in a column.
Below we have an application that shows a list of projects along with the number of tasks that each project has.

Our ProjectsController and the index view are shown below.
ruby
class ProjectsController < ApplicationController
  def index
    @projects = Project.find(:all)
  end
end
The ProjectsController.
ruby
<h1>Projects</h1>
<ol>
  <% @projects.each do |project| %>
  <li><%= link_to project.name, project_path(project) %> (<%= pluralize project.tasks.size, &rsquo;task&rsquo; %>)</li>
  <% end %>
</ol>The index view.
In the view we’re looping through each Project, showing the project’s name and the number of tasks (withprojects.tasks.size) it has. We’re also using the pluralize method so that either "task" or "tasks" will be shown depending on whether there is one or more tasks.

Improving The Database Access

Let’s look at the development log to see how the database is being accessed when we view the index page.
terminal
Rendering projects/index
  SQL (0.3ms)   SELECT count(*) AS count_all FROM "tasks" WHERE ("tasks".project_id = 61) 
  SQL (0.2ms)   SELECT count(*) AS count_all FROM "tasks" WHERE ("tasks".project_id = 62) 
  SQL (0.3ms)   SELECT count(*) AS count_all FROM "tasks" WHERE ("tasks".project_id = 63) 
  SQL (0.2ms)   SELECT count(*) AS count_all FROM "tasks" WHERE ("tasks".project_id = 64) 
  SQL (0.2ms)   SELECT count(*) AS count_all FROM "tasks" WHERE ("tasks".project_id = 65)
A call to the database is being made for each project in the list, getting a count of the tasks each project has. How can we reduce the number of queries made? One way would be to use eager loading, as we showed in the previous episode. We do this by modifying the ProjectsController so that it gets the tasks along with the projects.
ruby
@projects = Project.find(:all, :include => :tasks)
Now, when we refresh the page we see that the number of database requests has fallen to two.
terminal
Processing ProjectsController#index (for 127.0.0.1 at 2009-01-26 21:24:28) [GET]
  Project Load (1.1ms)   SELECT * FROM "projects" 
  Task Load (7.1ms)   SELECT "tasks".* FROM "tasks" WHERE ("tasks".project_id IN (61,62,63,64,65))
This is an improvement but we’re getting all of the information for the relevant tasks when all we want is a count for the tasks in each project. Instead of the eager loading we’re going to use a counter cache column.

Implementing a Counter Cache Column

The first thing to do to implement our counter cache is to add a column to the Projects table that stores the number of Tasks associated with that Project. We’ll generate the new column with a migration.
ruby
script/generate migration add_tasks_count
Our migration file looks like this. We’ll explain it below.
ruby
class AddTasksCount < ActiveRecord::Migration
  def self.up
    add_column :projects, :tasks_count, :integer, :default => 0
    
    Project.reset_column_information
    Project.all.each do |p|
      p.update_attribute :tasks_count, p.tasks.length
    end
  end

  def self.down
    remove_column :projects, :tasks_count
  end
end
The name we’ve given our new column is important. It needs to be the name of the model we want to count, followed by_count. The default value is important too. If it is not zero then the count won’t work correctly. After we’ve created our new column we need to set the value of the count column in each current project. To do this we loop through each project and set its tasks_count attribute to the number of tasks the project has. We use length rather than sizeto get the number of tasks as size would use the counter cache column. which would have its default value of zero.
As we’re modifying a Project in the same migration in which we add a column to it there’s a possibility that the column information may be cached. It’s a good practice to make sure that it’s reset and we do this withProject.reset_column_information.

Has it worked?

Now that we’ve added the column we’ll remove the eager loading from the ProjectsController and then reload the page.
terminal
Processing ProjectsController#index (for 127.0.0.1 at 2009-01-26 22:07:13) [GET]
  Project Load (0.7ms)   SELECT * FROM "projects"
There is only one database call now. as there’s no need to get any information at all from the Tasks table. The number of tasks for each Project now comes from the Project’s tasks_count column.

One last step.

There is one last step to do. If we add a new task to a project then the counter column will not be updated as we’ve not told Rails to use the tasks_count column as a counter cache column. We do that by updating our Task model.
ruby
class Task < ActiveRecord::Base
  belongs_to :project, :counter_cache => true
  has_many :comments
end
We tell Rails to use the counter cache column by adding :counter_cache => true to the tasks association withProject. Now we’ll use the console to add a new task to a project.
terminal
>> p = Project.first
=> #<Project id: 61, name: "Project 1", created_at: "2009-01-26 20:34:36", updated_at: "2009-01-26 22:05:22", tasks_count: 20>
>> p.tasks.create(:name => "New task")
=> #<Task id: 1201, name: "New task", project_id: 61, created_at: "2009-01-26 22:24:13", updated_at: "2009-01-26 22:24:13">
Adding a new task via the console.
Now we’ll refresh the page and see if the counter column is updated.

The project’s counter cache has been updated.

Our first project now has 21 tasks and we’re still only accessing the projects table in our database.

Have fun!!

No comments:

Post a Comment