Getting to Know the Ruby Standard Library – Pathname


Pathname is a useful library that demonstrates a good refactoring: “Replace Data Value With Object”. In this case the data value is a String representing a path.
Pathname wraps that String and provides a wide variety of methods for manipulating paths that would normally require you to call the File, FileStat, Dir, and IO modules. You might even be using it already without knowing it, as it shows up in Rails’ paths. First we will see a short example of Pathname in action, and then we will look at some of the patterns it employs.

Example of Pathname

require 'pathname'

path = Pathname.new('.') # current directory
path += 'tests'          # ./tests
path += 'functional'     # ./tests/functional
path = path.parent       # ./tests
path += 'config.yaml'    # ./tests/config.yaml
path.read                # contents of ./tests/config.yaml
path.open('w') { |io| io << "env: test" }
path.read                # "env: test"
# children returns an Array and ignores a block, so iterate with each;
# path now points at a file, so we list its parent directory
path.parent.children.each { |p| puts p.inspect } # prints all the files/directories in ./tests

Pathname provides a nicer interface for interacting with the filesystem. Now let’s take a look at how it works. As usual, I suggest opening up the file for yourself and following along; if you have Qwandry installed, you can type qw pathname.

Pathname

We will start with how a Pathname gets created:

def initialize(path)
  path = path.__send__(TO_PATH) if path.respond_to? TO_PATH
  @path = path.dup
  #...

The main thing Pathname#initialize does is store a copy of the path argument, optionally calling TO_PATH on it first; we’ll come back to this in a moment. Since strings are mutable in Ruby, dup is called on the path argument. This ensures that if you later call path.gsub!('-','_'), or any other method that mutates the string, Pathname’s copy will remain the same. This is a good practice whenever you are dealing with mutable data. Now let’s take a look at TO_PATH:
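As a brief aside before we get to TO_PATH, here is a minimal sketch of why the defensive copy matters. The Label class is hypothetical and exists only for this illustration; it is not part of Pathname.

```ruby
# Hypothetical wrapper class, used only to illustrate the defensive copy.
class Label
  def initialize(text)
    @text = text.dup   # keep a private copy, as Pathname#initialize does
  end

  def to_s
    @text.dup          # hand out copies too, so callers cannot reach our state
  end
end

name  = String.new('my-file')  # an explicitly mutable String
label = Label.new(name)
name.gsub!('-', '_')           # mutates the caller's string in place

puts name                      # my_file
puts label                     # my-file
```

Without the dup in initialize, both lines would print my_file, because the wrapper and the caller would share one String object. Back in pathname.rb, TO_PATH is defined like this: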

if RUBY_VERSION < "1.9"
  TO_PATH = :to_str
else
  # to_path is implemented so Pathname objects are usable with File.open, etc.
  TO_PATH = :to_path
end

This code selects behavior based on the current RUBY_VERSION. Ruby 1.9 will set TO_PATH to :to_path, and the initializer above will call it if the object being passed in implements to_path. A quick look at the RDocs shows that File implements to_path, so we can pass File objects directly into Pathname.new. Now let’s take a look at how Pathname makes use of the rest of Ruby’s file libraries.
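Before moving on, here is a small sketch of the to_path hook from the caller’s side. ConfigLocation is a hypothetical class I made up for the example; the only thing it has in common with File is that it responds to to_path. This assumes Ruby 1.9 or later.

```ruby
require 'pathname'

# Hypothetical class: it is not a String, but it knows how to describe
# itself as a path through to_path.
class ConfigLocation
  def initialize(dir, name)
    @dir = dir
    @name = name
  end

  def to_path
    File.join(@dir, @name)
  end
end

location = ConfigLocation.new('tests', 'config.yaml')
path = Pathname.new(location)   # the initializer calls location.to_path

puts path.to_s                  # tests/config.yaml
puts path.basename.to_s         # config.yaml
```

Nothing here touches the filesystem, so the file does not need to exist. With construction covered, the next snippet from pathname.rb is read: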

def read(*args) 
  IO.read(@path, *args) 
end

The definition of Pathname#read is quite simple: it takes the path you passed in and hands it to IO.read. So where you might have done IO.read(path), with Pathname you can just do path.read. This pattern is repeated in Pathname for many of the common filesystem operations; for instance, take a look at mtime:

def mtime() 
  File.mtime(@path) 
end

We see the same pattern has been repeated, but this time it delegates to File. Since a Pathname may reference a file or a directory, some of the methods will delegate to either Dir or File:

def unlink()
  begin
    Dir.unlink @path
  rescue Errno::ENOTDIR
    File.unlink @path
  end
end

First it tries to delete the path as a directory; if that raises Errno::ENOTDIR, it deletes the path as a file. Perhaps a simpler formulation would be directory? ? Dir.unlink @path : File.unlink @path, but the result is the same. This pattern encapsulates knowledge that the caller no longer needs to deal with.
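To see the delegation from the caller’s side, here is a rough sketch that works inside a throwaway temporary directory. The names scratch and notes.txt are made up for the example, and the checks hash is just a convenient way to collect results.

```ruby
require 'pathname'
require 'tmpdir'

checks = {}  # results collected inside the block, printed at the end

Dir.mktmpdir do |tmp|
  dir  = Pathname.new(tmp) + 'scratch'   # hypothetical directory name
  file = dir + 'notes.txt'               # hypothetical file name

  dir.mkdir
  file.open('w') { |io| io << 'hello' }

  # Pathname#read and IO.read return the same contents.
  checks[:same_read]     = (file.read == IO.read(file.to_s))
  checks[:mtime_is_time] = file.mtime.is_a?(Time)

  # One method name covers both kinds of path.
  file.unlink
  dir.unlink
  checks[:file_gone] = !file.exist?
  checks[:dir_gone]  = !dir.exist?
end

checks.each { |name, ok| puts "#{name}: #{ok}" }
```

The caller never has to decide between Dir.unlink and File.unlink; that choice lives inside Pathname.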

Pathname also overrides operators where they make sense, which lets you concatenate paths. Let’s look at how Pathname does this.

def +(other)
  other = Pathname.new(other) unless Pathname === other
  Pathname.new(plus(@path, other.to_s))
end

The plus operator is just a method like any other method in Ruby, so overriding it is pretty simple. First, the other path being added to this one is converted to a Pathname if it isn’t one already. After that, the paths are combined with plus(@path, other.to_s). This might look rather odd since we just converted other to a Pathname, but remember that Pathname treats anything responding to to_path specially. The round trip through Pathname.new lets + accept any such object, and to_s then hands plus the plain String it works on.

Here are some examples of its behavior:

p = Pathname.new('/usr/local/lib') #=> #<Pathname:/usr/local/lib> 
p + '/usr/'                        #=> #<Pathname:/usr/> 
p + 'usr/'                         #=> #<Pathname:/usr/local/lib/usr/>
p + '../include'                   #=> #<Pathname:/usr/local/include>

Adding an absolute path to an existing path behaves differently from adding a relative path or a path referencing the parent directory. Clearly there is some logic here beyond typical string concatenation. For the sake of brevity, we can skip the details of how plus is implemented, though if anyone is interested, we can dissect it later. I suggest skimming the rest of pathname.rb to see how public and private methods are defined, and how they are used to simplify methods.
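If you want to try the same operator pattern in your own code, here is a minimal sketch. Segments is a hypothetical class, not anything from the standard library, and it does none of the absolute-path or parent-directory handling that Pathname’s plus performs. It only shows the two steps from Pathname#+: coerce the argument, then return a new object.

```ruby
# Hypothetical value object, used only to illustrate defining +.
class Segments
  attr_reader :parts

  def initialize(parts)
    @parts = Array(parts).map(&:to_s).freeze
  end

  # a + b is just sugar for a.+(b)
  def +(other)
    other = Segments.new(other) unless Segments === other  # coerce, like Pathname#+
    Segments.new(parts + other.parts)                      # return a new object
  end

  def to_s
    parts.join('/')
  end
end

base     = Segments.new('tests')
combined = base + 'functional'

puts combined          # tests/functional
puts base.+('unit')    # tests/unit
puts base              # tests
```

Because + returns a new object instead of changing the receiver, base is still tests after both additions, which matches how Pathname behaves.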

Overview

Pathname wraps up a lot of functionality that is scattered across multiple libraries and encapsulates it behind a single object. Hopefully you have seen how Pathname can be useful, and have also learned a few patterns that will make your code more usable.
