Squashing N+1 queries in Rails
ActiveRecord
is a thing of beauty. It lowers the barrier to entry significantly by abstracting the database layer behind an interface that is easier for new developers to learn. All abstractions are some sort of trade-off, often betwwen usability and explicitness.
One of the sacrifices we make when using an ORM like ActiveRecord
is that the underlying SQL queries become hidden from us, and it takes some time before you can "see the forest through the trees".
Building a blogging platform
For example, lets say we're building a platform like Medium or substack. Our Article
model has many Comment
s, and both resources are tied to a User
.
class Article < ApplicationRecord
belongs_to :user
has_many :comments
end
class Comment < ApplicationRecord
belongs_to :user
belongs_to :article
end
class User < ApplicationRecord
has_many :articles
has_many :comments
end
Wit this simple setup an article can only have top level comments, but it's enough to illustrate the point.
Loading our data in the controller
To build the page to view of a single article, our controller might look something like this.
def show
@article = Article.find(params[:id])
end
Seems innocent enough. We're fetching the article the user requested by using the :id
param from the URI. Standard Rails stuff!
Building our template
For our template, we want to show a few things.
- Article information
- user name
- title
- body
- Comment information
- user name
- body
- updated_at
Let's say our template looks something like this:
<h1><%= @article.title %></h1>
<div>
<span><%= @article.user.full_name %></span>
</div>
<section>
<%= @article.body %> @
</section>
<section>
<% @article.comments.each do |comment| %>
<div>
<span><%= comment.user %></span>
<span><%= comment.body %></span>
<span><%= comment.updated_at %></span>
</div>
<% end %>
</section>
This should work! However, there is something particularly devious about how it works.
Rails can only help us so much
When Rails fetches data in a controller, it will figure out which queries it needs to send to the database based on the logic in the action (a nickname for methods inside controllers). This is useful, especially when your query is changed over the course of the action. For example, conditionally adding a where
clause for a column if a param is present.
@articles = Article.all
if params[:start_date].present?
@articles.where("created_at > ?", params[:start_date])
end
This makes a single trip to the database. It doesn't need to fetch all of the articles on the first line, and then run a second query inside the if
block. Rails is smart enough to help us out here.
Where Rails can't help us out anymore, is when we hand this query off to our view.
Hidden queries in our views
Let's discuss what an N+1 query is and then take another look at our view. An N+1 query is when your application makes one query (the +1
part), but later on decides it needs more information for each row that was returned by that query (the N
part).
In our view below, the only record we fetched from our database was the current article @article = Article.find(params[:id])
, but if we take a look below, we're displaying information for comments and users too!
<h1><%= @article.title %></h1>
<div>
<span><%= @article.user.full_name %></span>
</div>
<section>
<%= @article.body %> @
</section>
<section>
<% @article.comments.each do |comment| %>
<div>
<span><%= comment.user %></span>
<span><%= comment.body %></span>
<span><%= comment.updated_at %></span>
</div>
<% end %>
</section>
When Rails starts to build your template, it will just start firing off database requests to get the information it needs using ActiveRecord
.
Like a firework, our single query in our controller is going to shoot into the sky, and burst into multiple new queries if our template has unhandled data requirements.
For example, to display the name of the author of the article <%= @article.user.full_name %>
, Rails will have to fetch this record in a new query with something like this (assume the author's user_id
is 2
)
SELECT * FROM users WHERE users.id = 2;
This isn't so bad. This fires off a second database query since we're only trying to display the author for the article. Where things can get really nasty, is inside our comments section.
<% @article.comments.each do |comment| %>
<div>
<span><%= comment.user %></span>
<span><%= comment.body %></span>
<span><%= comment.updated_at %></span>
</div>
<% end %>
To render this portion of the query, Rails will fetch the comments in one query, but it will also make N
number of requests to fetch the authors of those comments.
Using includes to eager load data
We can get by this by building a query in our controller that actually defines multiple queries using the includes
method from ActiveRecord
.
Let's improve our query using the belongs_to
and has_many
assocations we've added to our models.
def show
@article = Article.includes(:user, comments: [:user]).find(params[:id])
end
Now all of the data that our template needs will be fetched within in our controller before Rails attempts to render it. If you're wondering how to know what to pass to includes
, that's where our belongs_to
and has_many
definitions come in. We're querying off of our Article
model which defines belongs_to :user
and has_many :comments
associations.
Thus, eager loading the author of the article and the article's comments looks like this:
Article.includes(:user, :comments)
To eager load the author of each comment we have to use the same Ruby symbols that we used in our Comment
model's belongs_to
statement (in this case it will also be :user
).
Article.includes(:user, comments: [:user])
If you need to load multiple relationships off of comments
, simply add them to the array. You can keep nesting those as well by defining an array for each relationship.
Options for avoiding N+1 queries in your code base
As you might imagine, keeping track of these N+1 queries can be difficult because they happen silently by default. We humans are pretty bad at keeping responsibilities that can easily be forgotten in our heads. I know I am.
Luckily, we have a few options to help us out.
bullet gem
The bullet gem exists for exactly this purpose. It helps locate places in your application where it would benefit from eager loading by either logging info to your Rails console, or displaying JavaScript alerts.
strict_loading
The more bullet-proof solution (forgive me, I had to) is leveraging the new strict_loading
API that shipped in Rails 6.1. This will raise exceptions in your application to prevent the building of N+1 queries altogether. I haven't tried using it yet, but I think I will soon.
Conclusion
If you find yourself rendering data that you don't remember fetching in your controller, odds are ActiveRecord
is going to do what it can to help without being able to consider the performance implications. I hope this was helpful in giving you a few tools that you can leverage when optimizing pages in your Rails applications. The number of database queries isn't the only thing to worry about, but don't worry. We'll discuss other perfomance implications in the near future.