Notes about Active Record in Yii framework
Yii comes with a nice implementation of Martin Fowler’s Active Record pattern. Being one of the major components of Yii framework, it nicely handles validation, persistence and querying of the objects stored in a database, so you can focus on the actual logic instead. Here are some useful and good to know things for Yii programmers.
It’s a finder
Instead of talking about CActiveRecord objects in general, let’s take a Post class as an example of an AR model.
Besides representing your marvelous blog posts, each Post object also acts as a finder of posts. When you call Post::model() you get a special instance of Post which serves as a finder, among other things. This is also highlighted in the documentation:
It is provided for invoking class-level methods (something similar to static class methods.)
It’s explicitly stated that the returned object should be used to invoke class-level methods, or roughly speaking, methods which work on collections of Post objects rather than on one specific instance of Post. All find* methods are class-level methods because they operate on a collection.
To act as a finder, every Post object holds a criteria object inside it, an instance of CDbCriteria. Let’s call it the inner criteria; it’s what you get when you call $post->getDbCriteria(). When you call various scopes you’re actually modifying the inner criteria (that’s why I like to think of scopes as criteria modifiers). Relational query methods like together and with also modify it. When we initially call Post::model(), the inner criteria is clean, that is, it does not have any SQL conditions applied. When we further chain scopes like Post::model()->published()->recent(), the criteria gets modified and remembers all query details like condition, ordering, limit, joins, etc. (provided we wrote these scopes properly). Finally, when we fire off one of the find* or count* methods (these methods can also receive an additional criteria object which will be merged with the inner criteria), the actual database query is performed and the inner criteria is reset to a clean state. The last step is very important, because if the criteria weren’t reset, all subsequent queries would still be using the old criteria details.
So here comes the not so obvious part: this inner criteria is shared by ALL finder instances of Post. Let’s have a look at an example to understand why it matters.
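The original listing is not reproduced here, so what follows is only a sketch of the kind of setup being described. It assumes a Yii 1.1 application, a Post model with a hypothetical status column, and a published scope built on it. The controller hands the scoped finder to a CActiveDataProvider, which the index view is assumed to render with a CListView.

```php
<?php
// Sketch only: requires a Yii 1.1 application; column names are assumptions.

class Post extends CActiveRecord
{
    public static function model($className = __CLASS__)
    {
        return parent::model($className);
    }

    public function tableName()
    {
        return 'post';
    }

    public function scopes()
    {
        return array(
            // Hypothetical definition: status = 1 means "published".
            'published' => array('condition' => 'status = 1'),
        );
    }
}

class PostController extends CController
{
    public function actionIndex()
    {
        // The scoped finder goes into the provider; no query runs yet.
        $dataProvider = new CActiveDataProvider(Post::model()->published());

        // The index view is assumed to pass $dataProvider to a CListView.
        $this->render('index', array('dataProvider' => $dataProvider));
    }
}
```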
So far everything is great. When we go to /post/index we see all published blog posts. Now suppose we also want to show the latest popular posts. Let’s use a named scope popular for that. Our controller and view will change a bit.
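A sketch of how the changed action might look, again assuming a Yii 1.1 application. The popular scope is assumed to be defined in Post::scopes() (for example as an ordering on a hypothetical views column), and the view is assumed to print $popularPosts next to the list.

```php
<?php
// Sketch only: requires a Yii 1.1 application and a Post model
// that defines "published" and "popular" scopes (both assumed).

class PostController extends CController
{
    public function actionIndex()
    {
        // Unchanged: scoped finder handed to the provider, no query yet.
        $dataProvider = new CActiveDataProvider(Post::model()->published());

        // New: this query runs immediately, through the same Post finder.
        $popularPosts = Post::model()->popular()->findAll();

        $this->render('index', array(
            'dataProvider' => $dataProvider,
            'popularPosts' => $popularPosts,
        ));
    }
}
```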
Now when we go to the posts page, we suddenly see ALL posts under the “Posts” header, both published and unpublished. It seems like Post::model()->published() isn’t working when we pass it to the CActiveDataProvider. Why? Remember that the inner criteria object is shared among all Post finders? When we pass Post::model()->published() to the provider, the actual database query isn’t performed, because findAll is NOT called yet (it will be called when the CListView inside the view gets rendered). When we call Post::model()->popular()->findAll(), the criteria object inside Post is reset. So all Post finders, including the one which sits inside the provider, now have a clean criteria. By the time the CListView gets rendered it’s too late: the criteria is already clean, so the CActiveDataProvider fetches all posts. To overcome this, we can either get the popular posts before creating the data provider:
or in case of some complicated scenario we can save and restore the inner criteria:
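Neither listing is reproduced here; both options might look roughly like the sketch below. It assumes a Yii 1.1 application and the published and popular scopes from before. The action name actionIndexAlt is made up for illustration. Note that a criteria object installed with setDbCriteria() may bypass defaultScope() for that one query, so check this against your Yii version.

```php
<?php
// Sketch only: requires a Yii 1.1 application; scope names are assumptions.

class PostController extends CController
{
    // Option 1: run the immediate query first, then build the provider.
    public function actionIndex()
    {
        $popularPosts = Post::model()->popular()->findAll();
        $dataProvider = new CActiveDataProvider(Post::model()->published());

        $this->render('index', array(
            'dataProvider' => $dataProvider,
            'popularPosts' => $popularPosts,
        ));
    }

    // Option 2: save the pending criteria, run the other query, restore it.
    public function actionIndexAlt()
    {
        $finder = Post::model()->published();

        $saved = $finder->getDbCriteria();         // carries "published"
        $finder->setDbCriteria(new CDbCriteria()); // next query starts clean

        $popularPosts = $finder->popular()->findAll();

        $finder->setDbCriteria($saved);            // put "published" back
        $dataProvider = new CActiveDataProvider($finder);

        $this->render('index', array(
            'dataProvider' => $dataProvider,
            'popularPosts' => $popularPosts,
        ));
    }
}
```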
It took me some time to figure out what was happening when I first discovered it, so you’d better keep it in mind. Now let’s take a peek inside CActiveRecord to understand why the criteria is shared:
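The framework source is not reproduced here. Instead, here is a simplified, framework-free model of the mechanism in plain PHP. Only the names model() and $_models come from Yii; the FinderSketch class, its string-based "criteria" and the column names are invented for illustration.

```php
<?php
// Simplified stand-in for the pattern; NOT the real CActiveRecord source.

class FinderSketch
{
    // One cached finder per class name.
    private static $_models = array();

    // "Inner criteria": here just a list of condition strings.
    private $conditions = array();

    public static function model($className = __CLASS__)
    {
        if (!isset(self::$_models[$className])) {
            self::$_models[$className] = new $className();
        }
        return self::$_models[$className];
    }

    // A scope: modifies the inner criteria and returns the finder.
    public function published()
    {
        $this->conditions[] = 'status = 1';
        return $this;
    }

    public function popular()
    {
        $this->conditions[] = 'views > 100';
        return $this;
    }

    // Stand-in for find*(): builds the SQL, then resets the criteria.
    public function findAll()
    {
        $sql = 'SELECT * FROM post';
        if ($this->conditions) {
            $sql .= ' WHERE ' . implode(' AND ', $this->conditions);
        }
        $this->conditions = array();
        return $sql;
    }
}

// Scope applied now, query deferred (like the data provider does).
$pending = FinderSketch::model()->published();

// An unrelated query runs in between, on the very same instance.
$popularSql = FinderSketch::model()->popular()->findAll();

// The deferred query finds its criteria already reset.
$pendingSql = $pending->findAll();
```

In this sketch the pending published condition also leaks into the popular query, and the deferred query ends up with no conditions at all.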
Wow, actually the whole static model is cached in a static variable $_models to make things faster, which means we get the same instance every time we call Post::model(). That of course implies that the inner criteria will be the same too.
Searching models based on data inside related models
Another thing I had some trouble with is getting related models and, at the same time, using relational data to filter primary models. Scenario:
- A (primary model) has many B-s
- You want to search A and also fetch all of its B-s using the with option
- You want to select only some A-s based on data in B
To illustrate it, let’s add tags to posts:
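The relation might be declared roughly as follows. This is a sketch for a Yii 1.1 model; the Tag class and the post_id and tag_id column names are assumptions.

```php
<?php
// Sketch only: requires a Yii 1.1 application; Tag, post_id, tag_id are assumed.

class Post extends CActiveRecord
{
    public static function model($className = __CLASS__)
    {
        return parent::model($className);
    }

    public function tableName()
    {
        return 'post';
    }

    public function relations()
    {
        return array(
            // MANY_MANY goes through the post_tag bridge table.
            'tags' => array(self::MANY_MANY, 'Tag', 'post_tag(post_id, tag_id)'),
        );
    }
}
```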
Note that we use an additional table post_tag to store the MANY_MANY relations.
In the post listing we show tags as links:
Now we should output posts for a given tag when /post/index?tag=tagname is requested. No problem, we already have a relation for that:
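A sketch of such a query, assuming a Yii 1.1 application, the tags relation on Post, and a hypothetical name column on the tag table. It is meant to live inside a controller action.

```php
<?php
// Sketch only: requires a Yii 1.1 application; tag.name is an assumed column.

$tag = Yii::app()->request->getQuery('tag');

// Filter through the relation itself: the condition applies to the
// joined "tags" rows, so only the matching tag is loaded per post.
$posts = Post::model()->with(array(
    'tags' => array(
        'condition' => 'tags.name = :tag',
        'params' => array(':tag' => $tag),
    ),
))->findAll();
```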
This code correctly finds tagged posts, but has another problem: only the requested tag is shown in the list of tags for every found post. For example, if we visited the /post/index?tag=gaming page, we’d see only “gaming” in the “Tagged:” section of every post, even if the post has more tags. This is correct behavior, as we’re explicitly restricting related tags to only those with the specified name. To get all related tags we can introduce an additional join to filter posts:
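One way this could look, as a sketch for Yii 1.1: the filtering join uses its own aliases (pt and ft, both made up), so the tags relation stays unrestricted and loads every tag of each matching post. The post_id, tag_id and name columns are assumptions, and t is the default alias of the primary table.

```php
<?php
// Sketch only: requires a Yii 1.1 application; column names are assumptions.

$tag = Yii::app()->request->getQuery('tag');

$criteria = new CDbCriteria();

// Extra join used purely for filtering, separate from the "tags" relation.
$criteria->join = 'INNER JOIN post_tag pt ON pt.post_id = t.id '
                . 'INNER JOIN tag ft ON ft.id = pt.tag_id';
$criteria->condition = 'ft.name = :tag';
$criteria->params = array(':tag' => $tag);

// The relation itself carries no condition, so all tags are fetched.
$posts = Post::model()->with('tags')->findAll($criteria);
```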
Note that if you don’t want to load related models, there is no need for an additional join, just use the select option:
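A sketch of that variant, under the same assumptions as before (Yii 1.1, a tags relation, a name column on the tag table). Setting select to false is intended to keep the relation in the query for filtering without populating the related objects.

```php
<?php
// Sketch only: requires a Yii 1.1 application; tag.name is an assumed column.

$tag = Yii::app()->request->getQuery('tag');

$posts = Post::model()->with(array(
    'tags' => array(
        'select' => false,                 // join for filtering, load nothing
        'condition' => 'tags.name = :tag',
        'params' => array(':tag' => $tag),
    ),
))->findAll();
```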
Using scopes safely
Scopes are great. They allow us to refactor monstrous find* invocations and break them into simple, maintainable and reusable methods.
Since all AR query stuff eventually gets converted into its SQL equivalent, most errors arise from naming conflicts. We don’t need to worry about this when calling one of the AR query methods with an additional criteria parameter, since we see all table names, column names and aliases right where the call happens. But when writing scopes, we should take extra measures to prevent errors, because we don’t know in advance all the places where a scope is going to be used, so we must ensure nothing is hardcoded.
Quote table, alias and column names
Table alias
As noted in the documentation, the table alias may vary in relational queries, so we don’t know it in advance. A good example is already shown above.
Table names
If you ever need to know the name of a table, all AR models define the method tableName.
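A minimal sketch of a scope written along these lines, assuming a Yii 1.1 model. The recent scope and the created column are hypothetical. The alias is requested from the model at call time and the column name is quoted through the connection, so nothing about the surrounding query is hardcoded.

```php
<?php
// Sketch only: requires a Yii 1.1 application; "recent" and "created" are assumed.

class Post extends CActiveRecord
{
    public static function model($className = __CLASS__)
    {
        return parent::model($className);
    }

    public function tableName()
    {
        return 'post';
    }

    // Hypothetical parameterized scope.
    public function recent($limit = 5)
    {
        // Ask the model for its current alias (quoted) instead of writing "t".
        $alias = $this->getTableAlias(true);
        $column = $this->getDbConnection()->quoteColumnName('created');

        $this->getDbCriteria()->mergeWith(array(
            'order' => $alias . '.' . $column . ' DESC',
            'limit' => $limit,
        ));

        return $this;
    }
}
```

When a raw table name is needed, for example in a hand-written join, $this->getDbConnection()->quoteTableName($this->tableName()) gives the quoted form.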
Use unique names for binding parameters
We could have invented our own solution for this, but CDbCriteria already provides the static $paramCount property, which is used internally by the framework itself to generate unique parameter names.
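A sketch of a scope that borrows the counter, assuming a Yii 1.1 model. The authoredBy scope and the author_id column are hypothetical. CDbCriteria::PARAM_PREFIX is the prefix the framework uses for its own generated placeholders.

```php
<?php
// Sketch only: requires a Yii 1.1 application; "authoredBy" and author_id are assumed.

class Post extends CActiveRecord
{
    public static function model($className = __CLASS__)
    {
        return parent::model($className);
    }

    public function tableName()
    {
        return 'post';
    }

    // Hypothetical parameterized scope with a collision-free placeholder.
    public function authoredBy($authorId)
    {
        // Each call gets a fresh placeholder name from the shared counter.
        $param = CDbCriteria::PARAM_PREFIX . CDbCriteria::$paramCount++;

        $this->getDbCriteria()->mergeWith(array(
            'condition' => $this->getTableAlias(true) . '.author_id = ' . $param,
            'params' => array($param => $authorId),
        ));

        return $this;
    }
}
```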
Merge default scopes of parent classes
If both the child and parent AR classes define defaultScope, do not try to merge them with something like array_merge; use CDbCriteria::mergeWith instead:
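The runnable part of the sketch below shows, in plain PHP, why array_merge is the wrong tool: both scopes use the condition key, so the child's value replaces the parent's. The scope contents are made up. The mergeWith variant is given only as a comment, since it needs a Yii 1.1 application.

```php
<?php
// Hypothetical default scopes of a parent and a child model class.
$parentScope = array('condition' => 'deleted = 0', 'order' => 'created DESC');
$childScope = array('condition' => "type = 'article'");

// array_merge() keeps only the child's value for the duplicated
// 'condition' key, so the parent's "deleted = 0" filter is lost.
$merged = array_merge($parentScope, $childScope);

// Inside a Yii 1.1 child model, defaultScope() could instead do
// something like this (untested sketch):
//
//     $criteria = new CDbCriteria(parent::defaultScope());
//     $criteria->mergeWith(array('condition' => "type = 'article'"));
//     return $criteria->toArray();
//
// mergeWith() combines the two conditions with AND rather than
// overwriting one with the other.
```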
Also …
You should be able to debug AR stuff by looking at the generated SQL. I use CWebLogRoute to see all SQL queries at the bottom of a page, so I can quickly search them. If you can’t find your query, just use some unique alias or SQL comment (like “abracadabra”) and search for that.
Of course we can fall back to SQL, but AR models are much more convenient to work with than plain arrays, especially if the models contain complicated business logic. Also, some components like CActiveDataProvider work only with AR.