Hash Data Structure in Ruby

March 5, 2022   • ruby

When I used to program in C# (or even Java), one of the topics that always puzzled me was when to use which class. There are thousands and thousands of classes in the framework. For example, here are five of the types that implement the IDictionary interface.

  • Hashtable
  • SortedList
  • SortedList<TKey, TValue>
  • Dictionary<TKey, TValue>
  • ConcurrentDictionary<TKey, TValue>

Of course, there’s a use for each, and I don’t doubt the decisions of the framework creators. But as a programmer, having so many choices can be really daunting and confusing. When do you choose which type? What if you made a wrong choice?

In contrast, Ruby has a single Hash class, which is a versatile data structure. It can act as a data object, a dictionary, a hash table, an insertion-ordered list of key-value pairs, and much more. This post explores Ruby hashes in detail and lists some lesser-known methods that are useful in day-to-day programming.

What is a Hash?

A Hash is a general-purpose data structure for storing named data. It contains unique keys, and each key has a single mapped value. When given a key, a Hash provides you with the data. In addition, hashes in Ruby can have almost any object as their key.

You can create a hash using the following syntax, which looks similar to a JSON object:

person = { name: "Akshay", age: 31 }

However, this shorthand always creates symbol keys. If you want to use any other object as a key, you have to use the older "hash rocket" (=>) syntax.

data = { "foo" => "bar", 100 => "One Hundred" }

To retrieve the data, use the square-bracket notation, just as you would with an array:

person[:name]
=> "Akshay"
data[100]
=> "One Hundred"

A Hash preserves the insertion order of its entries. New entries are added at the end, and updating the value of an existing key does not move it. This matters when using iteration methods such as each, each_key, each_pair, and each_value.
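Here is a small sketch of that ordering behavior. The scores hash and the names in it are made up for illustration. It also shows one way to get a sorted hash when you need one, since a Hash does not sort itself.

```ruby
scores = { "carol" => 3, "alice" => 1 }
scores["bob"] = 2        # a new key goes to the end
scores["carol"] = 30     # updating an existing key keeps its position

scores.keys              # => ["carol", "alice", "bob"]

# sort_by returns an array of [key, value] pairs; to_h builds a new hash
# whose insertion order is the sorted order.
sorted = scores.sort_by { |name, _score| name }.to_h
sorted.keys              # => ["alice", "bob", "carol"]
```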

Default Values

If a hash doesn't contain the requested key, it returns nil. However, you can set a default value for the hash using its default= method. Alternatively, you can pass a block to Hash.new when creating the hash.

person = { name: "Akshay", age: 31 }
person[:city]
=> nil
person.default = "-"
=> "-"
person[:city]
=> "-"

person = Hash.new { |hash, key| "Default value for #{key}" }
person[:city]
=> "Default value for city"
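One detail worth knowing: a default block that only returns a value does not add the key to the hash. The sketch below shows that, along with two common patterns built on defaults. The variable names (lookup, groups, counts) are just examples I picked.

```ruby
# A default block that only returns a value does not store the key.
lookup = Hash.new { |hash, key| "Default value for #{key}" }
lookup[:city]            # => "Default value for city"
lookup.key?(:city)       # => false
lookup.length            # => 0

# Assigning inside the block stores the computed value on first access.
groups = Hash.new { |hash, key| hash[key] = [] }
groups[:fruit] << "apple"
groups[:fruit] << "pear"
groups                   # => {:fruit=>["apple", "pear"]}

# A plain default object works well for immutable values such as integers.
counts = Hash.new(0)
"banana".each_char { |char| counts[char] += 1 }
counts                   # => {"b"=>1, "a"=>3, "n"=>2}
```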

Useful Hash Methods

  • compact

Returns a copy of the hash with all nil-valued entries removed. The original hash is not modified. To remove the entries in place, use compact! instead.

h = { foo: 0, bar: false, bam: nil }
=> {:foo=>0, :bar=>false, :bam=>nil}
h.compact
=> {:foo=>0, :bar=>false}
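To make the difference between compact and compact! concrete, here is a short sketch using a made-up settings hash. Note that only nil values are removed; false stays.

```ruby
settings = { theme: "dark", font: nil, debug: false }

cleaned = settings.compact      # new hash; settings is untouched
cleaned                         # => {:theme=>"dark", :debug=>false}
settings.key?(:font)            # => true

settings.compact!               # removes nil values in place
settings                        # => {:theme=>"dark", :debug=>false}

# compact! returns nil when there was nothing to remove,
# so avoid chaining further calls onto its result.
settings.compact!               # => nil
```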
  • empty?

Returns true if there are no hash entries, false otherwise.

user = { name: 'dhh', company: 'basecamp' }
=> {:name=>"dhh", :company=>"basecamp"}
user.empty?
=> false
{}.empty?
=> true
  • hash.eql? obj

Returns true only if the following conditions are true:

  1. obj is a Hash
  2. hash and obj have the same keys (order doesn’t matter)
  3. For each key, hash[key].eql?(obj[key]) is true

This is different from equal?, which returns true if and only if both values refer to the same object.

user = { name: 'David', company: 'basecamp' }
dhh = { name: 'David', company: 'basecamp' }
user.eql? dhh
=> true
user.equal? dhh
=> false
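It is also worth contrasting eql? with ==, which compares values using == rather than eql?. The two disagree when the values are numerically equal but of different types. A minimal sketch, with example hashes of my own:

```ruby
ints   = { a: 1 }
floats = { a: 1.0 }

ints == floats          # => true, because 1 == 1.0
ints.eql?(floats)       # => false, because 1.eql?(1.0) is false

# Key order does not affect either comparison.
{ a: 1, b: 2 }.eql?({ b: 2, a: 1 })   # => true
```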
  • hash.except(*keys)

Returns a new Hash without the entries for the given keys. This method is part of core Ruby since version 3.0.

h = { a: 100, b: 200, c: 300 }
h.except(:a)          #=> {:b=>200, :c=>300}
  • fetch_values(*keys)

Returns an array containing the values for the given keys. If a block is passed, it's called for each missing key, and the return value of the block is used as that key's value. Without a block, a missing key raises a KeyError.

h = {foo: 0, bar: 1, baz: 2}
h.fetch_values(:baz, :foo) # => [2, 0]
h.fetch_values(:baz, :tom) { |key| key.to_s } # => [2, "tom"]
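For comparison, here is how fetch_values behaves without a block next to values_at, which quietly returns the hash's default (nil here) for a missing key. The :missing_key symbol is just a placeholder I use to show that the rescue branch ran.

```ruby
h = { foo: 0, bar: 1, baz: 2 }

# values_at never raises; missing keys produce the default value.
h.values_at(:baz, :tom)       # => [2, nil]

# fetch_values without a block raises KeyError for a missing key.
result =
  begin
    h.fetch_values(:baz, :tom)
  rescue KeyError
    :missing_key
  end
result                        # => :missing_key
```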

  • has_key?, key?, member?, include?

The methods has_key?, key?, member?, and include? all check if the hash contains the given key.

user = {:name=>"dhh", :company=>"basecamp"}

user.has_key? :name
=> true
user.key? :name
=> true
user.member? :name
=> true
user.include? :name
=> true
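These methods matter most when a key may legitimately hold nil, because a plain lookup cannot tell "missing" from "present but nil". A quick sketch with a made-up profile hash:

```ruby
profile = { name: "dhh", nickname: nil }

profile[:nickname].nil?      # => true (key present, value is nil)
profile[:email].nil?         # => true (key missing)

profile.key?(:nickname)      # => true
profile.key?(:email)         # => false
```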
  • keys, values

The methods keys and values return arrays containing the keys and values of the hash, respectively.

user.keys # => [:name, :company]
user.values # => ["dhh", "basecamp"]
  • length

Returns the number of entries in the hash.

user.length # => 2
  • slice(*keys)

Returns a new hash containing the entries for the given keys.

user.slice(:name)
=> {:name=>"dhh"}
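slice and except are complements of each other: one keeps the given keys, the other drops them. The sketch below uses a made-up account hash and assumes Ruby 3.0 or later, where except is available without ActiveSupport.

```ruby
account = { name: "dhh", company: "basecamp", role: "cto" }

account.slice(:name, :role)     # => {:name=>"dhh", :role=>"cto"}
account.except(:name, :role)    # => {:company=>"basecamp"}

# Keys that are not in the hash are ignored by both methods.
account.slice(:name, :email)    # => {:name=>"dhh"}
account.except(:email).length   # => 3
```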
  • transform_values

Returns a new hash with the values transformed by a block that receives each value. It doesn't modify the original hash. To change the original hash, use transform_values! instead.

data = { a: 100, b: 200 }
data.transform_values { |v| v * 2 }
=> {:a=>200, :b=>400}
data
=> {:a=>100, :b=>200}
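As a follow-up, here is a short sketch of the symbol-to-proc shorthand and the in-place variant. The prices hash is an example of my own.

```ruby
prices = { "coffee" => 3.5, "tea" => 2.0 }

# Symbol-to-proc shorthand for a one-method transformation.
labels = prices.transform_values(&:to_s)
labels                                 # => {"coffee"=>"3.5", "tea"=>"2.0"}

# The bang version changes the receiver in place.
prices.transform_values! { |price| price * 2 }
prices                                 # => {"coffee"=>7.0, "tea"=>4.0}
```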