Reading Rails - More Migrations


In our last amble through the Rails source, we investigated how Rails finds migrations. This time around we will look into how it knows which migrations need to be applied. Along the way, we will pick up new idioms and approaches by reading real-world code.

To follow along, open each library in your editor with qwandry, or just look it up on GitHub. Open up migration.rb in ActiveRecord; we’ll be picking up where we left off last time.

Creating the Migrator

We finished last time with a set of MigrationProxy objects that the Migrator class had created. Rails creates an instance of Migrator using these proxy objects:

def initialize(direction, migrations, target_version = nil)
  @direction         = direction
  @target_version    = target_version
  @migrated_versions = nil

  @migrations = migrations
  
  validate(@migrations)

  ActiveRecord::SchemaMigration.create_table
end

The direction and target_version determine which migrations are run. If there is no target_version, the Migrator migrates everything that has not run yet. We will assume that we are migrating :up to the latest version. These arguments are stored, and then Rails validates the migrations. validate contains a few oddities, so let’s examine it:

def validate(migrations)
  name ,= migrations.group_by(&:name).find { |_,v| v.length > 1 }
  raise DuplicateMigrationNameError.new(name) if name

  version ,= migrations.group_by(&:version).find { |_,v| v.length > 1 }
  raise DuplicateMigrationVersionError.new(version) if version
end

validate groups the migrations by name and by version, and verifies that no group has more than one entry. group_by is defined on Enumerable and returns a hash of arrays:

numbers = [1.23, 1.5, 2.6, 2.0, 3.7]
numbers.group_by{ |n| n.to_i } 
#=> {1=>[1.23, 1.5], 2=>[2.6, 2.0], 3=>[3.7]}

The key is the result of the block, and the value is an array of the entries for that key.

First the migrations are grouped by name and then by version using the &:name and &:version shorthands. Once they are grouped, validate searches the hash using find, which returns the first entry for which the block returns a truthy value, or nil if there is none. Since we are iterating over a hash with find, each entry is a [key, value] pair, and the block's two parameters receive the key and the value. Using an underscore in place of the key is an idiom implying the block is not interested in that argument.

The assignment name ,= ... may look invalid, but it is another destructuring trick. In this case it means that the first value of the array returned by find is assigned to name, and the remainder is ignored. If find returns nil because nothing is duplicated, name is simply nil. Here are some more examples of this type of destructuring:

a    = [1,2,3] #=> a is [1,2,3]
a,   = [1,2,3] #=> a is 1
a,b  = [1,2,3] #=> a is 1, b is 2

This could also have been written name, _ = ....
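To see the whole check working outside of Rails, here is a minimal sketch. FakeMigration and duplicate_name are hypothetical names invented for this example and are not part of ActiveRecord; only the group_by, find, and ,= line mirrors validate.

```ruby
# Hypothetical stand-in for MigrationProxy; only name and version matter here.
FakeMigration = Struct.new(:name, :version)

# Returns the first duplicated name, or nil when every name is unique.
def duplicate_name(migrations)
  name ,= migrations.group_by(&:name).find { |_, v| v.length > 1 }
  name
end

unique = [FakeMigration.new("CreateUsers", 1), FakeMigration.new("CreatePosts", 2)]
clash  = unique + [FakeMigration.new("CreateUsers", 3)]

duplicate_name(unique) #=> nil
duplicate_name(clash)  #=> "CreateUsers"
```

Because name ends up nil when find comes back empty, the trailing if name guard in validate only raises when a duplicate actually exists.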

SchemaMigration

Turn your attention back to Migrator#initialize. The last expression is ActiveRecord::SchemaMigration.create_table. Let’s take a look at SchemaMigration in schema_migration.rb. Did you know that every Rails application shares one ActiveRecord model in common?

module ActiveRecord
  class SchemaMigration < ActiveRecord::Base
    #...
  end
end

Just as you do with your own models, Rails uses an ActiveRecord model to interact with the database. Unlike your models though, ActiveRecord’s migration code can’t rely on migrations, so it bootstraps this process with SchemaMigration.create_table, which creates the schema_migrations table in your database.

def self.create_table(limit=nil)
  unless connection.table_exists?(table_name)
    version_options = {null: false}
    version_options[:limit] = limit if limit

    connection.create_table(table_name, id: false) do |t|
      t.column :version, :string, version_options
    end
    connection.add_index table_name, :version, unique: true, name: index_name
  end
end

Go ahead and inspect your database; you’ll see the schema_migrations table there if you have run any migrations.

Rails uses the same schema definition code that you use when defining your own tables. The unless connection.table_exists?(table_name) ensures that this table is only created if it doesn’t exist yet. This is how Rails persists the list of applied migrations.

Migrating

Once a Migrator has been instantiated, migrate is called on it. This method is responsible for selecting the migrations to be run, and then executing each migration.

def migrate
  #...
  running = runnable

  running.each do |migration|
    #...
  end
end

The pending migrations are selected in runnable.

def runnable
  runnable = migrations[start..finish]
  if up?
    runnable.reject { |m| ran?(m) }
  else
    #...
  end
end

def start
  up? ? 0 : (migrations.index(current) || 0)
end

def finish
  migrations.index(target) || migrations.size - 1
end

Rails indexes into the array of migrations, picking the ones in the range start..finish. If you haven’t come across ranges in Ruby yet, they have many uses, one of which is extracting array elements:

desserts = [:cake, :pie, :cookies, :ice_cream]
# Inclusive (two dots)
desserts[0..2]   #=> [:cake, :pie, :cookies]
# Exclusive (three dots)
desserts[0...2]  #=> [:cake, :pie]

In this case, start and finish are methods on Migrator. When migrating up, start is zero, the first migration; otherwise it is the index of the current migration, migrations.index(current), falling back to zero if that is not found. The migrations will finish at either the index of the target or the last migration. Both start and finish use Array#index to get the index of an array element. Let’s take a look at how that works:

desserts = [:cake, :pie, :cookies, :ice_cream]

# Find an element
desserts.index(:pie)            #=> 1
desserts.index(:chicken)        #=> nil

# Find by a block
desserts.index{|d| d =~ /ice/ } #=> 3

Array#index can be used to find the index of a specific object, or it can be used like Enumerable#find, matching against a block.
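Putting ranges and Array#index together, the slice that runnable takes can be sketched with plain integers. This is only an illustration: finish_index is a hypothetical helper, and the version numbers are made up; the body mirrors the expression in Migrator#finish.

```ruby
# Hypothetical version numbers standing in for the sorted migrations array.
versions = [1, 2, 3, 4, 5]

# Mirrors Migrator#finish: fall back to the last index when there is no target.
# Note that `a || b - 1` parses as `a || (b - 1)`.
def finish_index(versions, target)
  versions.index(target) || versions.size - 1
end

versions[0..finish_index(versions, 3)]   #=> [1, 2, 3]
versions[0..finish_index(versions, nil)] #=> [1, 2, 3, 4, 5]
```

Since index returns nil for a missing element, the || fallback is what makes "no target" mean "everything up to the last migration".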

Back in Migrator#runnable, Rails extracts the candidate migrations and then rejects any for which ran? returns true. So how does it determine whether a migration has been run?

def ran?(migration)
  migrated.include?(migration.version.to_i)
end

def migrated
  @migrated_versions ||= Set.new(self.class.get_all_versions)
end

# Migrator.get_all_versions
def get_all_versions
  SchemaMigration.all.map { |x| x.version.to_i }.sort
end
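The three methods above boil down to a set membership test on integers. Here is a minimal sketch without a database, assuming the versions come back as strings, since the version column is defined as a string; migrated_set is a hypothetical helper and the version values are made up.

```ruby
require "set"

# Hypothetical helper mirroring Migrator#migrated: normalize stored versions
# to integers once, then keep them in a Set for fast membership checks.
def migrated_set(stored_versions)
  Set.new(stored_versions.map(&:to_i))
end

# Versions as they might come back from the string column in schema_migrations.
migrated = migrated_set(["20130101000000", "20130102000000"])

migrated.include?("20130101000000".to_i) #=> true
migrated.include?(20130103000000)        #=> false
migrated.include?("20130101000000")      #=> false, a String never equals an Integer
```

The last line shows why both sides call to_i before comparing.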

ran? looks for the migration version in the set of @migrated_versions. The most interesting bit here is that Rails gets those versions using SchemaMigration.all, an ActiveRecord model method to query the database. This is a neat example of Rails leveraging ActiveRecord model code. Now that we have dived all the way down to see how the Migrator determines which migrations should run, let us pop back up and look at the remainder of Migrator#migrate:

def migrate
  #...
  running = runnable

  running.each do |migration|
    #...
    begin
      execute_migration_in_transaction(migration, @direction)
    rescue => e
      canceled_msg = use_transaction?(migration) ? "this and " : ""
      raise StandardError, "An error has occurred, #{canceled_msg}all later migrations canceled:\n\n#{e}", e.backtrace
    end
  end
end

The Migrator runs each of these running migrations with execute_migration_in_transaction, but most interesting is how migrate handles exceptions.

If anything goes wrong while running the migration, Rails catches and annotates the exception. The new exception explains the current state of your migrations, and then includes the message from the original exception. Rails also explicitly sets the backtrace of the new exception to that of the original exception, e.backtrace. If Rails did not pass in a backtrace, Ruby would give the user a trace to the point where the new exception was raised, which obscures the cause of the error.
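The same pattern works in any Ruby code. The sketch below is an illustration only: risky_step, with_context, and capture_wrapped are hypothetical names, and the error message is made up. The third argument to raise is what carries the original backtrace over.

```ruby
# Hypothetical failing step, standing in for a migration that blows up.
def risky_step
  raise ArgumentError, "column already exists"
end

# Hypothetical helper: re-raises any failure with extra context,
# reusing the original backtrace via the third argument to raise.
def with_context(label)
  yield
rescue => e
  raise StandardError, "#{label} failed:\n\n#{e}", e.backtrace
end

# Returns the wrapped exception so it can be inspected.
def capture_wrapped
  with_context("AddEmailToUsers") { risky_step }
rescue => wrapped
  wrapped
end

error = capture_wrapped
error.message         # the label, a blank line, then the original message
error.backtrace.first # mentions risky_step, where the failure really happened
```

On Ruby 2.1 and later, the original exception is also available through error.cause, because Ruby records it automatically when a new exception is raised inside a rescue block.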

Recap

In this exploration we have found that Rails uses its own ActiveRecord model to track which migrations you have run. While digging through the source, we also came across a few other interesting bits:

  • Enumerable#group_by returns a hash where the keys are the grouping, and the values are the groups.
  • name ,= values is an example of array destructuring.
  • Rails tracks migrations with the schema_migrations table in your database.
  • Arrays can be indexed with ranges.
  • Array#index returns the index of an object, or the index of the first element matching a block.
  • When rescuing an exception and raising a new one, you can pass in the original backtrace.
