Ruby Standard Library: Base64
July 10, 2021 in ruby

The Base64 module in the Ruby standard library performs encoding and decoding for binary data using a Base64 representation. In one of my previous post I investigated how the Base64 algorithm works in C#. Let’s try to do the same in Ruby.

Why Base64?

The Base64 encoding is a way of representing som binary data in ASCII text. As this SO post explains, we can’t just stream some binary data over the wire, as it might be interpreted wrongly by a protocol. Hence, it’s safe to

  1. convert (encode) it first to ASCII text
  2. transfer the ASCII text
  3. convert it back (decode) to the original data.

As the RFC states, Base64 encoding of data is also used in many situations to store or transfer data in environments that, perhaps for legacy reasons, are restricted to US-ASCII [1] data. Base encoding can also be used in new applications that do not have legacy restrictions, simply because it makes it possible to manipulate objects with text editors.

The purpose of using base64 to encode data is that it translates any binary data into purely printable characters.

Example

require 'base64'

line = "It's a great time\nto learn Ruby"

# Encode the line
encodedAscii = Base64.encode64(line)

# Output:
# SXQncyBhIGdyZWF0IHRpbWUKdG8gbGVhcm4gUnVieQ==
puts encodedAscii

# Decode the base64 encoded ASCII text
decodedData = Base64.decode64(encodedAscii)

# Output:
# It's a great time
# to learn Ruby
puts decodedData

Under the hood

This is how the Base64 module is implemented in the Ruby standard library. This is a stripped-down version to keep things simple and easy to understand.

# lib/base64.rb

module Base64
  def self.encode64(bin)
    [bin].pack("m")
  end
  
  def self.decode64(str)
    str.unpack("m")
  end
end

You can see that the encode64 method just delegates the call to the pack method on the Array containing bin. So the Array#pack and Array#unpack perform the actual encoding/decoding of the data.

According to the docs:

Packs the contents of arr into a binary sequence according to the directives in aTemplateString.

m: String: base64 encoded string (see RFC 2045)


Side Effects

While reading the source code for this module, here’s what I learned.

module_function

This is part of the base64.rb code.

module Base64
  module_function
  
  def encode64(bin)
    ...
  end
  
  # rest of the code
end

Notice the module_function under the name of the module. I was not sure what it did, but googling a little revealed the docs, and this SO post. Here’s the gist of what this does.

Creates module functions for the named methods. These functions may be called with the module as a receiver, and also become available as instance methods to classes that mix in the module.

If used with no arguments, subsequently defined methods become module functions.

So the code is calling module_function before the other methods so you can call them as Base64.encode64(str).

However, it looks like you can simply achieve this using the self keyword on the method.

module Base64
  
  def self.encode64(bin)
    ...
  end
  
  # rest of the code
end

Personally, I prefer the second syntax.