Getting to Know the Ruby Standard Library – Pathname
Pathname is a useful library that demonstrates a good refactoring:
“Replace Data Value With Object”. In this case the data value is a
String representing a path.
Pathname wraps that String and provides a wide variety of methods for manipulating paths that would normally require you to call the File, File::Stat, Dir, and IO classes. You might even be using it already without knowing it, since it shows up in Rails’ paths. First we will see a short example of Pathname in action, and then we will look at some of the patterns it employs.
Example of Pathname
require 'pathname'
path = Pathname.new('.') # current directory
path += 'tests' # ./tests
path += 'functional' # ./tests/functional
path = path.parent # ./tests
path += 'config.yaml' # ./tests/config.yaml
path.read # contents of ./tests/config.yaml
path.open('w'){|io| io << "env: test"}
path.read # "env: test"
path.parent.each_child{|p| puts p.inspect} # prints all the files/directories in ./tests
Pathname provides a nicer interface for interacting with the filesystem. Now let's take a look at how it works. As usual, I suggest opening up the file for yourself and following along; if you have Qwandry installed you can type
qw pathname.
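For comparison, here is a rough sketch of the same small task done twice: once with plain strings and the File, Dir, and IO classes, and once with Pathname. The method names read_config_with_strings and read_config_with_pathname are made up for this illustration, and the work happens in a temporary directory so nothing real is touched.

```ruby
require 'pathname'
require 'tmpdir'

# String-based version: the path is just data, and each step calls a
# different class (File, Dir, IO). Hypothetical helper for illustration.
def read_config_with_strings(root)
  config = File.join(root, 'tests', 'config.yaml')
  dir = File.dirname(config)
  Dir.mkdir(dir) unless File.directory?(dir)
  File.open(config, 'w') { |io| io << "env: test" }
  IO.read(config)
end

# Pathname version: the same steps are methods on the path object itself.
# Hypothetical helper for illustration.
def read_config_with_pathname(root)
  config = Pathname.new(root) + 'tests' + 'config.yaml'
  config.parent.mkpath            # creates the directory if it is missing
  config.open('w') { |io| io << "env: test" }
  config.read
end

Dir.mktmpdir do |dir|
  puts read_config_with_strings(dir)
  puts read_config_with_pathname(dir)
end
```

Both methods do the same work; the difference is that the second one never has to remember which class owns which operation.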
Pathname
We will start with how a
Pathname gets created:
def initialize(path)
path = path.__send__(TO_PATH) if path.respond_to? TO_PATH
@path = path.dup
#...
The main thing Pathname#initialize does is store a copy of the path argument, first calling TO_PATH on it if the argument responds to it; we’ll come back to this in a moment. Since strings are mutable in Ruby, dup is called on the path argument. This ensures that if you later call path.gsub!('-','_'), or any other method that mutates the string, Pathname's copy will remain the same. This is a good practice whenever you are dealing with mutable data. Now let's take a look at
TO_PATH:
if RUBY_VERSION < "1.9"
TO_PATH = :to_str
else
# to_path is implemented so Pathname objects are usable with File.open, etc.
TO_PATH = :to_path
end
This code picks its behavior based on the current RUBY_VERSION. On Ruby 1.9 and later, TO_PATH is set to :to_path, and the initializer above calls it if the object being passed in implements to_path. A quick look at the RDocs shows that File implements to_path, so we can pass files directly into Pathname. Now let’s take a look at how Pathname makes use of the rest of Ruby’s file libraries.
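Before moving on, the two initializer behaviors described so far can be tried out in a short sketch. It assumes Ruby 1.9 or later, and the PathLike class is hypothetical, invented only to show an object that responds to to_path.

```ruby
require 'pathname'

# The defensive copy: mutating the original string afterwards
# does not change what the Pathname holds.
name = String.new('my-app/config')   # String.new guarantees a mutable string
path = Pathname.new(name)
name.gsub!('-', '_')

puts name   # my_app/config
puts path   # my-app/config

# Hypothetical class for illustration: anything that responds to
# to_path can be handed to Pathname.new.
class PathLike
  def to_path
    '/tmp/example'
  end
end

puts Pathname.new(PathLike.new)   # /tmp/example
```

Returning to the file libraries, this is how Pathname defines read: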
def read(*args)
IO.read(@path, *args)
end
The definition of Pathname#read is quite simple: it just takes the path you passed in and uses it to call IO. Where you might have done IO.read(path), with Pathname you can just do path.read. This pattern is repeated in Pathname for many of the common filesystem operations. For instance, take a look at
mtime:
def mtime()
File.mtime(@path)
end
We see the same pattern has been repeated, but this time it delegates to
File. Since a
Pathname may reference a file or a directory, some of the methods will delegate to either
Dir or
File:
def unlink()
begin
Dir.unlink @path
rescue Errno::ENOTDIR
File.unlink @path
end
end
First it tries to delete the path as a directory, then as a file. Perhaps a simpler formulation would be
directory? ? Dir.unlink @path : File.unlink @path, but the result is the same. This pattern encapsulates knowledge that the caller no longer needs to deal with.
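From the caller's side, this means one method covers both cases. The following is a small sketch, run inside a temporary directory; the method name unlink_demo and the entry names are made up for the illustration.

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical helper: creates an empty directory and a file, removes
# both with the same call, and reports whether each still exists.
def unlink_demo
  Dir.mktmpdir do |tmp|
    root = Pathname.new(tmp)
    dir  = root + 'empty_dir'
    file = root + 'note.txt'

    dir.mkdir
    file.open('w') { |io| io << 'hello' }

    # The caller does not need to know which entry is a directory.
    [dir, file].map do |entry|
      entry.unlink
      entry.exist?
    end
  end
end

p unlink_demo   # [false, false]
```

Note that, like Dir.unlink itself, this only removes a directory when it is empty.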
Pathname also overrides operators where they make sense, which lets you concatenate paths. Let’s look at how
Pathname does this.
def +(other)
other = Pathname.new(other) unless Pathname === other
Pathname.new(plus(@path, other.to_s))
end
The plus operator is just a method like any other method in Ruby, so overriding it is pretty simple. First, the other path being added to this one is converted to a Pathname if it isn’t one already. After that, the paths are combined with plus(@path, other.to_s). This might look rather odd since we just converted other to a Pathname, but remember that Pathname treats anything responding to to_path specially: wrapping other first lets the initializer handle that conversion, and to_s then hands plus an ordinary string.
Here are some examples of its behavior:
p = Pathname.new('/usr/local/lib') #=> #<Pathname:/usr/local/lib>
p + '/usr/' #=> #<Pathname:/usr/>
p + 'usr/' #=> #<Pathname:/usr/local/lib/usr/>
p + '../include' #=> #<Pathname:/usr/local/include>
Adding an absolute path to an existing path behaves differently from adding a relative path or a path referencing the parent directory. Clearly there is some logic here beyond our typical string operators. For the sake of brevity, we can skip the details of how plus is implemented, though if anyone is interested, we can dissect it later. I suggest skimming the rest of pathname.rb, looking at how public and private methods are defined and how they are used to simplify methods.
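To make the contrast with ordinary string operations concrete, here is a short sketch that runs the same inputs through string concatenation, File.join, and Pathname#+. The variable names are only for illustration.

```ruby
require 'pathname'

base = '/usr/local/lib'

# Plain string tools simply glue the pieces together.
puts base + '/usr/'                  # /usr/local/lib/usr/
puts File.join(base, '/usr/')        # /usr/local/lib/usr/
puts File.join(base, '../include')   # /usr/local/lib/../include

# Pathname#+ interprets the right-hand side as a path.
path = Pathname.new(base)
puts path + '/usr/'                  # /usr/
puts path + '../include'             # /usr/local/include
```

None of these calls touch the filesystem; Pathname#+ works purely on the text of the two paths, so the result is the same whether or not the directories exist.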
Overview
Pathname wraps up a lot of functionality that is scattered across multiple libraries behind a single object. Hopefully you have seen how Pathname can be useful, and have also learned a few patterns that will make your code more usable.