How HashMap works internally in Java

How HashMap works internally in Java

One of the frequently asked question for experienced Java developers is “How HashMap works internally in Java?” or “How get() and put() method of HashMap works internally”. If you give a solid answer for this question, chances are more that you will get through the interview.

I assume that you know what is HashMap in Java. If you are new to the concept refer this java doc.

In order to understand it’s internal working I would suggest you to have an understanding of equals and hashcode methods. Refer my previous post equals() and hashcode().

Internal implementation of HashMap in Java

The first thing to know is that HashMap works on the principal of hashing.

HashMap maintains an array of buckets. Each bucket is a linked list of key value pairs encapsulated as Entry<K,V> objects. This array of buckets is also called table. The index of the table is the hash value(hash code) of the key. Below is the pictorial representation of it.

how hashmap works internally

Each node of the linked list is an instance of a class called Entry<K,V>. The Entry class implements Map.Entry. Below is the syntax of the Entry class,

Each entry object is associated with only one particular key but its value may change (if same key is reinserted later with a different value) – hence key is final while value is not.
Each Entry object has a field called next which points to the next Entry thus behaving like a singly linked list. The hash field stores the hashed value of the key which is the index of the table.

Two factors that affect the performance of HashMap are initial capacity and load factor. The capacity is is the size of the table array, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased.

When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

Let us consider a simple example to understand how the hashmap works.

Understanding put() method.

Let us see what happens when the put method is called as in the below statement.

map.put(“one”, 1);

First the hashcode() method defined in the HashMap is called to obtain the hashcode of the key. So the hashcode of the key “one” is calculated. This hashcode is passed to an internal hash function which will return the hashed value of the key. This value is nothing but the index of the hash bucket/array.

So the index where this key value pair has to be placed in found. Now the linked list at this index is iterated to see if the same key already exists. The key in each of the entry object in the linked list is compared with the passed key. For this comparison the key’s equals() method is used.

If the key is not already present then an Entry object with the key-value pair(“one”-1 in this case)  is created and added to the linked list. If the key is already present then its value is replaced with the new value.

Understanding get() method

To get the value of a key, the map has to know where or more precisely in which bucket to search for the key. For this, first the hashcode of the key is calculated which is then passed to the hash function to obtain the index. Once the index is found, as mentioned in put method, the linked list at the index will be iterated and equals() method will be called to find the matching key. Once it is found the value present in the particular Entry object which is associated with the key will be returned.

In summary below are the three important steps that takes place for put() and get(),

  • Calculate the hashcode of the key by calling hashcode() method.
  • Pass the calculated hashcode to an internal hash function to obtain the index of the table.
  • Iterate through the linked list present at the index and call the equals() method to find a matching key.

Now you might understand why it is important to have the knowledge on equals() and hashcode() method. If these two methods are not implemented properly, the HashMap or any such collection for that matter may not work properly.

Improvements in Java 8

In Java 8, there is a performance improvement for HashMap. When there are lot of hash collisions in the keys(different keys end up with same hash value or index), balanced trees will be used to store Entry objects rather than linked list. The  idea is that once the number of items in a hash bucket grows beyond a certain threshold, the bucket will switch from using a linked list of entries to a balanced tree.

Useful Links:

https://docs.oracle.com/javase/tutorial/collections/interfaces/map.html

https://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html

Thanks for reading.

The following two tabs change content below.
Working as a Java developer since 2010. Passionate about programming in Java. I am a part time blogger.
One comment
  1. very simple explination on How HashMap works in Java
    to stores the data in the form of key-value pairs and find the exact index from bucket for key value pair.

Add Comment

Required fields are marked *. Your email address will not be published.