ACTIVE RECORD ASSOCIATIONS TIPS AND TRICKS
After using Active Record in Ruby on Rails all this while I have finally realized that when it comes to associations some of the techniques being used were absolutely unwieldy and literally choked life out of my applications.
Not that these are anything new, but having learned first hand I am compelled to discuss some of the do’s and don’ts when dealing with the relationships in Active Record.
Proper usage of Collect & Select:
As we all know, collect & select are so widely used to iterate over collections and return,
only required collections through one liners. These one liners look beautiful but, we need to use these quite carefully. Actually
we need to use collect only for collecting attributes of one particular Table/Model
COLLECT:
We can use COLLECT only for COLLECTING ATTRIBUTES of existing Collection
Ex: Theme is a model with name as an attribute. To collect all the theme's names, we can use collect as below.
Theme.all.collect{|theme|theme.name}
Its not preferable to use collect for collecting ASSOCIATIONS RECORDS. To get associations records, we can use :joins, with
select attribute or :include.
SELECT:We can use SELECT to select records by checking conditions on existing collection
Ex: Theme is a model with name as an attribute. To select themes which have name,
Theme.all.select{|theme|theme.name.present?}
Its not preferable to use select to verify conditions related to associations. To get
associations records, we can use :joins with select attribute or :include. Using collect or select on associations will fire
lots and lots of unnecessary database queries, which will lead to slowing down of our application. Here we can see the difference.
For example, lets take a complex association hierarchy to see the difference in number of queries being fired, in different
Cases.

# To collect all questions related to a theme
Theme.first.topics.collect{|topic|topic.questions}.flatten
OR
# To get all topics which have at least 1 question
Theme.first.topics.select{|topic|topic.questions.present?}
Both will fire the queries, listed below.

Here, by calling questions association on topics through collect or select, it fired,
- 1 query for getting the required theme
- 5 queries fried for topics.collect{|topic|topic.questions} or topics.select{|topic|topic.questions.present?} (here, number of queries firing = number of topics in given theme.)
collect / select part fired 5 queries, because, there are 5 topics in selected theme. If there are 100 topics in the selected theme, it is going to fire 100 queries, which is not at all necessary.
1) For collecting all the questions of a theme, we can use :joins as shown here.
Question.find(:all,
:joins => "INNER JOIN questions_topics on
questions.id = questions_topics.question_id",
:conditions => "questions_topics.topic_id in
(#{Theme.first.topics.collect(&:id).join(',')})",
:order => "questions.title ASC")
Here, it is getting all the questions related to all the topics of a theme with just 1 query
- 1 query for getting the required theme
- 1 query for getting questions related to all the topics of given theme. (irrespective of the number of topics in given theme.
2) For getting all the topics of a theme, which have at least 1 question, we can do Eager Loading and then, use select as below.
topics = Topic.find(:all, :include => [:questions],
:conditions => "theme_id = 1")
topics.select{|topic|topic.questions.present?
Here, it is getting all the questions related to all the topics of a theme with just 1 query
- 1 query for getting the required theme
- 1 query for getting questions related to all the topics of given theme. (irrespective of the number of topics in given theme.
About :include and :joins
- Joins joins tables together in sql
- Include eager loads associations to avoid the n+1 problem (where one query is executed to retrieve the record and then one per association which is loaded).
Similarity:
- Both Include and Joins are used to get data from 2 or more tables at a time.
- Both are interchangeable most of the times with some syntactical changes.
- Both work at any given situation, but best usage depends upon the requirement.
Differences:Joins:
- We can select the fields to fetch from different tables
- It will just joins the tables together in sql
- Its light weight and fast, as we can select the fields and only joins tables
Include:
- We can not select the fields to fetch. Even we try to select, it will just neglect. It will fetch all the fields from all the matching records from included tables
- It will eager load the associations into memory
- Its heavy and slow, as it eager loads all the associations into memory
Usage:Without using joins or includes, if we try to access records and their assocations, there will be soo many queries as showed below.
Theme.first.topics.collect{|topic|topic.questions}.flatten.uniq
.collect{|question|question.answers}.flatten.uniq
It will fire below queries.

if 1 theme has 5 topics, each topic has 5 questions,it is firing
- 1 for getting topics of theme
- (number of topics = 5) for getting all questions of a theme
- (number of topics * number of questions per each topic = 5*5 = 25) for getting all answers of a theme
- Total 32 queries in this case to get all answers of one theme
if 1 theme has 10 topics, each topic has 10 questions,it will fire queries,
- 1 for getting topics of theme
- (number of topics = 10) for getting all questions of a theme
- (number of topics * number of questions per each topic = 10*10 = 100) for getting all answers of a theme
- Total 112 queries in this case to get all answers of one theme.
Using :joins will joins different tables to get required results to with selected fields. In this procedure, we can manually select all the required fields from different tables. Otherwise, there will be extra queries fired for association records.
questions = Question.find(:all,
:joins => "INNER JOIN questions_topics on
questions.id = questions_topics.question_id",
:conditions => "questions_topics.topic_id in
(#{Theme.first.topics.collect(&:id)
.join(',')})", :order => "questions.title ASC")
It fires, below sql queries

questions.each{|question|question.answers.size}
fires below sql queries

if 1 theme has 5 topics, each topic has 5 questions,it is firing
- 1 for getting topics of theme
- 1 for getting all questions of a theme
- (number of questions = 25) (for getting all answers of a theme)
- Total 28 queries in this case
if 1 theme has 10 topics, 1 topic has 10 questions,it will fire (1+1+1+100) queries total
- 1 for getting topics of theme
- 1 for getting all questions of a theme
- (number of questions = 100) for getting all answers of a theme (for getting all answers of a theme)
- Total 103 queries in this case.
Using :include will
eager load, all the associations tables into memory, along with the main table. So in this Procedure, it wont fire extra queries for association records
questions_with_answers =
Question.find(:all,:include => [:answers],
:joins => "INNER JOIN questions_topics on
questions.id = questions_topics.question_id",
:conditions => "questions_topics.topic_id in
(#{Theme.first.topics.collect(&:id)
.join(',')})", :order => "questions.title ASC")
It fires, below sql

questions_with_answers.each{|question|question.answers.size}
It wont Fire any queries here.if 1 theme has 5 topics, each topic has 5 questions,it is firing
- 1 for getting topics of theme
- 1 for getting all questions of a theme
- (number of questions = 2 quereis as we are has_many through association here) (for getting all answers of a theme)
- Total 5 queries in this case
if 1 theme has 10 topics, each topic has 10 questions,it will fire
- 1 for getting topics of theme
- 1 for getting all questions of a theme
- (number of questions = 2 quereis as we are using has_many through association) (for getting all answers of a theme)
- Total 5 queries in this case as well.
References:
- http://railscasts.com/episodes/181-include-vs-joins?view=asciicast
- http://www.fortytwo.gr/blog/18/9-Essential-Rails-Tips
Conclusion:
- Use Collect only for collecting attributes from existing collection
- Always better to use joins or include for fetching data from different tables or handling Associations, depending upon Requirement
- Use include , when we need all the association table's records along with main table
- Use Joins, when we need to select very few Columns from different Tables.