Reading Rails - Migrations
Today we’re going to talk about an often ignored workhorse of Rails, the Migrator. How does it find your migrations and run them? We will amble through the Rails source, and pick up whatever bits of knowledge we find along the way.
To follow along, open each library in your editor with qwandry, or just look it up on GitHub.
In the Beginning
In the beginning, there’s nothing. Maybe you have your database, but it’s empty. If you call rake db:migrate, all the pending migrations will run. Let’s start off by looking at that Rake task in databases.rake:
desc "Migrate the database (options: VERSION=x, VERBOSE=false, SCOPE=blog)."
task :migrate => [:environment, :load_config] do
  ActiveRecord::Migration.verbose = ENV["VERBOSE"] ? ENV["VERBOSE"] == "true" : true
  ActiveRecord::Migrator.migrate(ActiveRecord::Migrator.migrations_paths, ENV["VERSION"] ? ENV["VERSION"].to_i : nil)
  #...
end
We’ll gloss over the details of how Rake itself works, but for the time being notice that migrate requires two other tasks, [:environment, :load_config], to be run first. This ensures that the Rails environment and your database.yml have been loaded.
The body of the rake task configures ActiveRecord::Migration and ActiveRecord::Migrator using environment variables. Environment variables are a useful way of passing information to your application. Many variables, such as USER, are set by default. They can also be set on a per-command basis. For instance, if you invoked Rake with VERBOSE=false rake db:migrate, then ENV["VERBOSE"] would be the string "false".
# Invoke irb with an environment variable:
# > FOOD=cake irb
ENV['FOOD'] #=> 'cake'
ENV['USER'] #=> 'adam'
ENV['WAFFLES'] #=> nil
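As a rough sketch of how the two ternaries in the Rake task behave, the helper below applies the same expressions to a plain hash standing in for ENV. The name migration_options is made up for illustration and is not part of Rails.

```ruby
# Hypothetical helper that mirrors the two ternaries in the Rake task.
# `env` is any hash-like object; the real task reads from ENV.
def migration_options(env)
  # Verbose unless VERBOSE is set to something other than "true".
  verbose = env["VERBOSE"] ? env["VERBOSE"] == "true" : true
  # nil means "no target version given".
  version = env["VERSION"] ? env["VERSION"].to_i : nil
  { verbose: verbose, version: version }
end

defaults = migration_options({})
explicit = migration_options({ "VERBOSE" => "false", "VERSION" => "42" })

defaults[:verbose] # true, because VERBOSE was not set
explicit[:version] # 42, an integer rather than the string "42"
```

Note that with this logic any value other than the exact string "true" switches verbose output off, so VERBOSE=yes behaves like VERBOSE=false.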
The actual migration gets kicked off with ActiveRecord::Migrator.migrate, which is given a set of paths where migrations might exist, and an optional version to migrate to.
Finding Migrations
Pop open ActiveRecord’s migration.rb. Before we dig into this, take a moment to look over the exceptions defined at the top of the file. It is very easy to define custom exceptions, and migration.rb has a few good examples of them:
module ActiveRecord
  # Exception that can be raised to stop migrations from going backwards.
  class IrreversibleMigration < ActiveRecordError
  end
  #...
  class IllegalMigrationNameError < ActiveRecordError#:nodoc:
    def initialize(name)
      super("Illegal name for migration file: #{name}\n\t(only lower case letters, numbers, and '_' allowed)")
    end
  end
  #...
Custom exceptions can be specially handled, as we saw in the previous article about how Rails handles exceptions. In this case, IrreversibleMigration signals that a migration cannot be backed out. Another reason to define your own exceptions is to generate consistent error messages, as IllegalMigrationNameError does by overriding initialize. Just be sure that you call super.
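Here is a minimal sketch of the same idiom outside of Rails. The class name InvalidWidgetNameError and its message are invented for this example; the point is only that overriding initialize and calling super gives every raise site the same message format.

```ruby
# Hypothetical exception, modeled on IllegalMigrationNameError.
class InvalidWidgetNameError < StandardError
  def initialize(name)
    # super sets the message that callers see via #message.
    super("Illegal widget name: #{name} (only lower case letters allowed)")
  end
end

begin
  raise InvalidWidgetNameError.new("Bad Name")
rescue InvalidWidgetNameError => e
  # Rescuing the specific class lets other errors propagate untouched.
  message = e.message
end

message # "Illegal widget name: Bad Name (only lower case letters allowed)"
```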
Now scroll down and let’s look at Migrator.migrate:
class Migrator
  class << self
    def migrate(migrations_paths, target_version = nil, &block)
      case
      when target_version.nil?
        up(migrations_paths, target_version, &block)
      #...
      when current_version > target_version
        down(migrations_paths, target_version, &block)
      else
        up(migrations_paths, target_version, &block)
      end
    end
    #...
Depending on the target_version, we will migrate either up or down. Both methods follow the same pattern: they scan the migrations_paths for available migrations, and then instantiate a new Migrator. Let’s see how those migrations are located:
class Migrator
  class << self
    def migrations(paths)
      paths = Array(paths)
      files = Dir[*paths.map { |p| "#{p}/**/[0-9]*_*.rb" }]
      migrations = files.map do |file|
        version, name, scope = file.scan(/([0-9]+)_([_a-z0-9]*)\.?([_a-z0-9]*)?\.rb\z/).first
        raise IllegalMigrationNameError.new(file) unless version
        version = version.to_i
        name = name.camelize
        MigrationProxy.new(name, version, file, scope)
      end
      migrations.sort_by(&:version)
    end
This method is chock full of useful examples, so let’s settle down for a few minutes and read it carefully. We start off with a little trick used to ensure arguments are always arrays, the Array() method. Method, you say? Although unorthodox, it is valid to define CamelCase methods, even if they share the name of a class:
class Flummox
end
def Flummox()
  "confusing"
end
Flummox #=> Flummox
Flummox.new #=> #<Flummox:0x0000000bf0b5d0>
Flummox() #=> "confusing"
Ruby uses this to define an Array() method, which always returns an Array instance:
Array(nil) #=> []
Array([]) #=> []
Array(1) #=> [1]
Array("Hello") #=> ["Hello"]
Array(["Hello", "World"]) #=> ["Hello", "World"]
This is similar to to_a, but can be called on any object, including ones that do not define to_a, such as integers and strings. Rails uses this with paths = Array(paths) to ensure that paths will always be an array.
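To see why this matters for callers, here is a small sketch of a method that accepts one path, many paths, or nothing at all. The helper name glob_patterns is hypothetical; the pattern string is the one used in Migrator.migrations.

```ruby
# Hypothetical helper: Array() lets callers pass a String, an Array, or nil.
def glob_patterns(paths)
  Array(paths).map { |p| "#{p}/**/[0-9]*_*.rb" }
end

single = glob_patterns("db/migrate")
many   = glob_patterns(["db/migrate", "engines/blog/db/migrate"])
none   = glob_patterns(nil)

single      # ["db/migrate/**/[0-9]*_*.rb"]
many.length # 2
none        # []
```

Without Array(), the method would need a separate branch for each kind of argument.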
Next Rails searches those paths and filters them all in one impressive line:
files = Dir[*paths.map { |p| "#{p}/**/[0-9]*_*.rb" }]
Let’s unpack that from the inside out. paths.map { |p| "#{p}/**/[0-9]*_*.rb" } converts each path into a [shell glob](http://en.wikipedia.org/wiki/Glob_(programming\)). A path like "db/migrate" becomes "db/migrate/**/[0-9]*_*.rb", which will match any .rb file inside "db/migrate" or any of its subdirectories, as long as its name starts with a digit and contains an underscore. Those patterns are then splatted with the * operator and passed to Dir[].
Dir[] is extremely useful. It takes patterns like "db/migrate/**/[0-9]*_*.rb", and returns an array of matching files. Keep Dir[] at hand whenever you need to find files based on a path. The ** will recursively match subdirectories, and * is a wildcard for zero or more characters, so this pattern will match migrations like 20131127051346_create_people.rb.
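The pattern is easy to try out against a throwaway directory. This sketch uses Ruby's standard tmpdir and fileutils libraries; the file names are made up, and only the glob itself comes from the Rails source.

```ruby
require "tmpdir"
require "fileutils"

# Build a temporary tree, run the glob, and return paths relative to the root.
matches = Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "nested"))
  %w[
    20131127051346_create_people.rb
    nested/20131201000000_add_email.rb
    README.md
    helper.rb
  ].each { |name| FileUtils.touch(File.join(root, name)) }

  paths = Array(root)
  Dir[*paths.map { |p| "#{p}/**/[0-9]*_*.rb" }]
    .map { |file| file.sub("#{root}/", "") }
    .sort
end

# README.md and helper.rb are skipped; the nested migration is found.
matches # ["20131127051346_create_people.rb", "nested/20131201000000_add_email.rb"]
```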
Rails iterates over each matching file, and plucks out information using a regular expression with String#scan. If you’re not familiar with regular expressions, drop everything and learn them now. String#scan returns all the matches in a given string. If the expression contains capturing groups, those are returned in subarrays. For example:
s = "123 abc 456"
# No capturing groups:
s.scan(/\d+/) #=> ["123", "456"]
s.scan(/\d+\s\w+/) #=> ["123 abc"]
# Capturing a number and then a word:
s.scan(/(\d+)\s+(\w+)/) #=> [["123", "abc"]]
So file.scan will match the version ([0-9]+), a name ([_a-z0-9]*), and then optionally a scope ([_a-z0-9]*)?. Since String#scan always returns an array, and we know this pattern will only appear once, Rails just plucks off the first match. Rails assigns version, name, scope = ... all at once. This is done with array destructuring:
version, name, scope = ["20131127051346", "create_people"]
version #=> "20131127051346"
name #=> "create_people"
scope #=> nil
Notice that if there are more variables than array elements, the remaining variables will be assigned nil. This is a handy shortcut when assigning values from a regular expression.
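Putting scan and destructuring together, the following sketch parses a file name the way the block in Migrator.migrations does. The constant and method names are hypothetical, and ArgumentError stands in for Rails' IllegalMigrationNameError so the example runs without ActiveRecord.

```ruby
# The regular expression is copied from Migrator.migrations.
MIGRATION_FILENAME = /([0-9]+)_([_a-z0-9]*)\.?([_a-z0-9]*)?\.rb\z/

# Hypothetical helper returning [version, name, scope].
def parse_migration_filename(file)
  # If nothing matches, scan returns [], .first is nil, and all three
  # variables end up nil.
  version, name, scope = file.scan(MIGRATION_FILENAME).first
  raise ArgumentError, "Illegal name for migration file: #{file}" unless version
  [version.to_i, name, scope]
end

scoped = parse_migration_filename("db/migrate/20131127051346_create_people.blog.rb")
scoped # [20131127051346, "create_people", "blog"]

version, name, _scope = parse_migration_filename("db/migrate/20131127051346_create_people.rb")
version # 20131127051346
name    # "create_people"
```

A file such as README.md does not match at all, so version is nil and the guard clause raises.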
The version is converted to an Integer (a Fixnum on Rubies older than 2.4) using to_i, and the name is reformatted with name.camelize. String#camelize is defined by ActiveSupport, and refers to the conventions of snake_case vs CamelCase. This method will convert a string "create_people" into "CreatePeople".
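If you want a feel for the conversion without loading ActiveSupport, a rough plain-Ruby approximation looks like this. The method simple_camelize is invented for illustration and is not how ActiveSupport implements camelize, which handles more cases than lowercase snake_case names.

```ruby
# Hypothetical, simplified stand-in for String#camelize.
# Assumes a lowercase snake_case input such as a migration name.
def simple_camelize(snake)
  snake.split("_").map { |word| word.capitalize }.join
end

simple_camelize("create_people")       # "CreatePeople"
simple_camelize("add_email_to_people") # "AddEmailToPeople"
```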
Let’s save MigrationProxy for a moment and look at the final part of this method, migrations.sort_by(&:version). This expression sorts all the migrations using their version. How it sorts them, though, is rather interesting.
When & precedes the last argument of a method call, Ruby calls to_proc on that argument and passes the result as the block. Ruby 1.9 and later define Symbol#to_proc, which returns a Proc that calls the method named by the symbol. So &:version evaluates to something along the lines of { |obj| obj.version }.
Library = Struct.new(:name, :version)
libraries = [
  Library.new("Rails", "4.0.1"),
  Library.new("Rake", "10.1.0")
]
libraries.map{|lib| lib.version } #=> ["4.0.1", "10.1.0"]
# &:version => Proc.new{|lib| lib.version } (Roughly)
libraries.map(&:version) #=> ["4.0.1", "10.1.0"]
This is often used when sorting or mapping in Rails. As with all shorthands, make sure your team is comfortable with this syntax. When in doubt, the alternative is not much longer, and is clearer.
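The same shorthand applied to sort_by, as the Migrator uses it, can be sketched with a small struct. FakeMigration is a hypothetical stand-in for MigrationProxy, used only so the example runs on its own.

```ruby
# Hypothetical stand-in for MigrationProxy.
FakeMigration = Struct.new(:name, :version)

fake_migrations = [
  FakeMigration.new("AddEmail", 20131201000000),
  FakeMigration.new("CreatePeople", 20131127051346)
]

# The explicit block and the &:symbol shorthand sort identically.
by_block  = fake_migrations.sort_by { |m| m.version }
by_symbol = fake_migrations.sort_by(&:version)

# Symbol#to_proc can also be called directly to see what & produces.
version_of = :version.to_proc
first_version = version_of.call(fake_migrations.first)

by_symbol.map(&:name) # ["CreatePeople", "AddEmail"]
first_version         # 20131201000000
```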
The Migration
Now, return to MigrationProxy. As its name implies, this is a proxy for Migration instances. Proxy objects are a common design pattern used to transparently replace one object with another object. In this case the MigrationProxy is a stand-in for a real Migration object, but defers actually loading the migration’s source until it is needed. MigrationProxy achieves this by delegating methods:
class MigrationProxy
  #...
  delegate :migrate, :announce, :write, :disable_ddl_transaction, to: :migration
  private
  def migration
    @migration ||= load_migration
  end
  def load_migration
    require(File.expand_path(filename))
    name.constantize.new
  end
end
The delegate method defines each named method on MigrationProxy so that it forwards the call to the object named by the to: option, which is migration in this case. migration will lazily call load_migration if @migration has not yet been set. load_migration in turn requires the Ruby source, and then creates an instance using name.constantize.new. String#constantize is defined by ActiveSupport, and returns the constant named by a string:
"Person".constantize #=> Person
"Person".constantize.class #=> Class
"person".constantize #=> NameError: wrong constant name person
This can be very helpful when you want to dynamically reference a class.
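Outside of Rails, Ruby's own Object.const_get offers a rough analog. This sketch uses it in place of constantize, which is a substitution on my part: ActiveSupport's method does more than a single const_get call. The CreatePeople class is a made-up example.

```ruby
# Hypothetical migration-like class to look up by name.
class CreatePeople
  def migrate
    :migrated
  end
end

# Look the class up from a string, then instantiate it.
klass    = Object.const_get("CreatePeople")
instance = klass.new

klass            # CreatePeople
instance.migrate # :migrated

# A name that is not a valid constant raises NameError.
lookup_failed =
  begin
    Object.const_get("create_people")
    false
  rescue NameError
    true
  end
```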
Using MigrationProxy, Rails only loads and instantiates migrations when they are actually needed, which speeds up the migration process and saves some memory.
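The lazy-loading idea can be sketched in plain Ruby. Both classes here are hypothetical, and the forwarding method is written out by hand where MigrationProxy uses ActiveSupport's delegate. A counter records how often the expensive object is built.

```ruby
# Hypothetical stand-in for an object that is costly to load.
class ExpensiveMigration
  def migrate(direction)
    "migrating #{direction}"
  end
end

class LazyProxy
  attr_reader :load_count

  def initialize
    @load_count = 0
  end

  # Hand-written forwarding; delegate would generate a method like this.
  def migrate(direction)
    migration.migrate(direction)
  end

  private

  # Memoize so the real object is built at most once.
  def migration
    @migration ||= load_migration
  end

  def load_migration
    @load_count += 1
    ExpensiveMigration.new
  end
end

proxy = LazyProxy.new
loads_before = proxy.load_count # 0, nothing loaded yet
first_result = proxy.migrate(:up)
proxy.migrate(:down)
loads_after = proxy.load_count  # 1, loaded on first use only
```

A proxy that is created but never asked to migrate never pays the loading cost, which is the saving described above.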
The actual Migration class gets called by the Migrator when the proxy delegates the migrate method. This in turn calls either Migration#up or Migration#down, depending on whether the migration is being applied or rolled back.
Recap
We have only scratched the surface of Rails’ migration code, but we’ve learned some interesting things all the same. Migrations are started with a Rake task, which invokes the Migrator. The Migrator in turn finds our migrations and wraps them with MigrationProxy objects until the real Migration is needed.
As always, we have come across a number of interesting methods, idioms, and tricks:
- Environment variables can be accessed via the ENV constant.
- Defining custom exceptions is a common idiom for error handling.
- Array() converts any object into an array.
- Dir[] uses the shell glob syntax to search for files.
- String#scan returns all the matches in a string, and supports capturing groups.
- String#camelize converts snake_case strings to CamelCase.
- The & operator creates a Proc when called on a symbol.
- delegate can be used to implement the proxy design pattern.
- You can dynamically load constants with String#constantize.
Next time perhaps we can figure out exactly how the Migrator knows which migrations have been applied to your database.
More articles in this series
- Reading Rails - HTTP DELETEs With a Link
- Reading Rails - Time Travel
- Reading Rails - TimeWithZone
- Reading Rails - How Does MessageEncryptor Work?
- Reading Rails - How Does MessageVerifier Work?
- Reading Rails - How Do Batched Queries Work?
- Reading Rails - The Adapter Pattern
- Reading Rails - Errors and Validators
- Reading Rails - How Validations Are Configured
- Reading Rails - Concern
- Reading Rails - More Migrations
- Reading Rails - Migrations
- Reading Rails - Attribute Methods
- Reading Rails - Change Tracking
- Reading Rails - Handling Exceptions