Sunday, 14 September 2014

Load your ActiveRecord objects eagerly!

Eager loading associated ActiveRecord::Base model objects with ActiveRecord::QueryMethods#includes is common practice among Ruby on Rails developers, especially when the resulting query is built dynamically at runtime depending on factors that are not known in advance.
The nice thing about includes: it decides, for performance reasons, whether to generate one huge SQL join query or several small queries. The decision is based on the selection part of the SQL query (the WHERE conditions). If the selection refers to one of the joined tables, the query has to be the huge SQL chunk with all its joins and attribute aliases, which is a parsing nightmare.
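The decision described above can be pictured with a small plain-Ruby model. This is a simplified sketch and not Rails' actual implementation: the helpers referenced_tables and loading_strategy are hypothetical names, and the regular expression only recognises plain table.column references.

```ruby
# Simplified model of the decision; NOT the code Rails itself uses.
# Hypothetical helper: collects table names that appear as "table.column"
# (optionally double-quoted) in an SQL fragment.
def referenced_tables(sql_fragment)
  sql_fragment.scan(/"?([a-z_][a-z0-9_]*)"?\./i).flatten.map(&:downcase).uniq
end

# Hypothetical helper: picks the join strategy as soon as any condition
# mentions a table other than the base table, otherwise the small queries.
def loading_strategy(base_table, conditions)
  other_tables = conditions.flat_map { |c| referenced_tables(c) } - [base_table]
  other_tables.empty? ? :preload : :eager_load
end

loading_strategy("categories", ["foods.name LIKE '%milk%'"])  # => :eager_load
loading_strategy("categories", ["categories.name = 'Dairy'"]) # => :preload
```

A condition on the base table alone leaves the small queries possible; only a reference to an associated table forces the join.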
Because the join query is so costly to build and parse, the default approach for includes is preload, with its small and fast queries (read Preload your ActiveRecord objects!). Besides, each of the small queries is a likely candidate for query caching and therefore for a further performance improvement.
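To illustrate the preload approach, the following plain-Ruby sketch imitates what the small queries achieve: one result set per table, stitched together in memory. The data and the helper attach_foods are made up for illustration, and the SQL in the comments is only the approximate shape of what preload issues.

```ruby
# Made-up result of query 1, roughly: SELECT "categories".* FROM "categories"
categories = [
  { id: 1, name: "Dairy" },
  { id: 2, name: "Drinks" },
  { id: 3, name: "Bread" }
]

# Made-up result of query 2, roughly:
# SELECT "foods".* FROM "foods" WHERE "foods"."category_id" IN (1, 2, 3)
foods = [
  { id: 10, name: "milk",     category_id: 1 },
  { id: 11, name: "butter",   category_id: 1 },
  { id: 12, name: "soy milk", category_id: 2 }
]

# Hypothetical helper: attaches each food to its category without any join.
def attach_foods(categories, foods)
  foods_by_category = foods.group_by { |food| food[:category_id] }
  categories.map do |category|
    category.merge(foods: foods_by_category.fetch(category[:id], []))
  end
end

loaded = attach_foods(categories, foods)
loaded.map { |c| [c[:name], c[:foods].map { |f| f[:name] }] }
# => [["Dairy", ["milk", "butter"]], ["Drinks", ["soy milk"]], ["Bread", []]]
```

Neither query mentions the other table, so no column aliasing is needed and each statement stays short.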
For example, take the original code:
class Category < ActiveRecord::Base
  has_many :foods
end

class Food < ActiveRecord::Base
  belongs_to :category
end
and eager loading the categories and their associated foods, filtered by food name:
@categories = Category.includes(:foods).
  where("foods.name LIKE :term", { term: '%milk%' })
generates the SQL:
SELECT "categories"."id" AS t0_r0, 
"categories"."name" AS t0_r1, 
"categories"."created_at" AS t0_r2, 
"categories"."updated_at" AS t0_r3, 
"foods"."id" AS t1_r0, 
"foods"."name" AS t1_r1, 
"foods"."created_at" AS t1_r2, 
"foods"."updated_at" AS t1_r3, 
"foods"."category_id" AS t1_r4, 
"foods"."description" AS t1_r5 
FROM "categories" 
LEFT OUTER JOIN "foods" 
  ON "foods"."category_id" = "categories"."id" 
WHERE (foods.name LIKE '%milk%')
Please note that ActiveRecord::QueryMethods#includes generates a single SQL query that joins both tables with a LEFT OUTER JOIN, and that the projection part already contains a lot of aliased attributes.
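Why the aliased projection is laborious to process can be sketched in plain Ruby: every flat result row has to be split by alias prefix and folded back into nested objects. The sample rows and the helper nest_rows are invented for illustration; only the alias names (t0_r0, t0_r1, t1_r0, t1_r1, t1_r4) are taken from the query above, and this is not how Rails implements the step internally.

```ruby
# Made-up rows shaped like the result of the join query above.
# t0_* columns belong to categories, t1_* columns to foods.
rows = [
  { "t0_r0" => 1, "t0_r1" => "Dairy",  "t1_r0" => 10, "t1_r1" => "milk",           "t1_r4" => 1 },
  { "t0_r0" => 1, "t0_r1" => "Dairy",  "t1_r0" => 11, "t1_r1" => "milk chocolate", "t1_r4" => 1 },
  { "t0_r0" => 2, "t0_r1" => "Drinks", "t1_r0" => 12, "t1_r1" => "soy milk",       "t1_r4" => 2 }
]

# Hypothetical helper: folds the flat rows into one entry per category.
def nest_rows(rows)
  categories = {}
  rows.each do |row|
    # The category columns repeat on every row, so build each category once.
    category = categories[row["t0_r0"]] ||= { id: row["t0_r0"], name: row["t0_r1"], foods: [] }
    # A LEFT OUTER JOIN yields NULL food columns for a category without foods.
    next if row["t1_r0"].nil?
    category[:foods] << { id: row["t1_r0"], name: row["t1_r1"] }
  end
  categories.values
end

nest_rows(rows).map { |c| [c[:name], c[:foods].map { |f| f[:name] }] }
# => [["Dairy", ["milk", "milk chocolate"]], ["Drinks", ["soy milk"]]]
```

The category columns are transferred once per matching food, so the redundancy grows with every additional joined association.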
An easy step to make the code's intent clearer is to replace includes with eager_load:
@categories = Category.eager_load(:foods).
  where("foods.name LIKE :term", { term: '%milk%' })
The generated SQL is exactly the same:
SELECT "categories"."id" AS t0_r0, 
"categories"."name" AS t0_r1, 
"categories"."created_at" AS t0_r2, 
"categories"."updated_at" AS t0_r3, 
"foods"."id" AS t1_r0, 
"foods"."name" AS t1_r1, 
"foods"."created_at" AS t1_r2, 
"foods"."updated_at" AS t1_r3, 
"foods"."category_id" AS t1_r4, 
"foods"."description" AS t1_r5 
FROM "categories" 
LEFT OUTER JOIN "foods" 
  ON "foods"."category_id" = "categories"."id" 
WHERE (foods.name LIKE '%milk%')
But if you want to reveal which eager loading approach will be used, you should stick to eager_load, because it makes clear that you know:
  1. a huge join query will be generated
  2. the query will not be cached
  3. the selection, aggregation or order part of the query references at least one joined table
Otherwise you signal that you do not know how the ActiveRecord objects are loaded eagerly.

Supported by Ruby 2.1.1 and Ruby on Rails 3.2.17