Thursday, April 18, 2019

Understanding Internal Working of java.util.HashMap


Hello Friends, Today in this blog we will be exploring the internal working of HashMap collection classes. HashMap is an implementation of the Map interface and allows us to store & retrieve values in the key-value pattern. The internal structure of the Hashmap is designed in such an efficient way that it will give you the best performance for searching values in Map. Let's get started to gain more insights over the HashMap implementation.


Outline of the Tutorial
  • The basic principle of HashMap
  • What happens when you insert an element in hashmap
  • What happens when you retrieve an element in hashmap
  • Example


Basic Principal of HashMap
HashMap is the most popular data structure where we will be having multiple buckets. Each bucket can store multiple objects in a linked list format. Whenever an object needs to be saved, it will find out the hash value for the object. Based on hash value, HashMap will search for the bucket in which this object should go & the object is inserted in the bucket.

The hash value is the 32-bit integer number which represents the object uniqueness. Not necessary that if two objects have the same hash code will be equals. but the opposite will be true like if two objects are equals then their hashcode must be equals.


What happens when you insert an element in HashMap
  • Key Validation & Hashcode generation
    • If the key is null, then hash for the key will be 0
    • If the key is not null, then the hash of the key is calculated
  • Bucket Identification
    • Based on hash value, it identifies the bucket in which this element should go. Buckets are nothing but the Array of the Linked List with an initial size of 16 and having name table.
    • The basic technique to find bucket number is to take the "Binary AND" operation of the hash value with the size of the table say "n". It will give you a number between 0 to "n"; This is the bucket number.
  • Node Insertion
    • If there is no element in the table at that location. i.e we can say bucket. The first node is created called the leaf node. Every node has key, value, hash value & reference to the next node. values for each field are set and reference to the next node is set to null.
    • If there is already an element in the table at that location, the linked list get traversed till the end of the linked list. Here new node is created and it gets attached to the last element in the linked list.
Due to this algorithm, there is slight performance degradation while inserting an element, but it will create a solid data structure from which you can retrieve any value performantly.  So you could have understood the above process, but we will go through the eclipse screenshots in the below section, which gives you an idea of how these things are work when are in action.


What happens when you ask for the value of a key in HashMap
  • Key Validation & Hashcode generation
    • If the key is null, then hash for the key will be 0
    • If the key is not null, then the hash of the key is calculated
  • Bucket Identification
    • Based on hash value, it identifies the bucket in which this element is present
  • Search For the Element In Bucket
    • Compare the hash value with the hash value of the first Node & key with the key of the first node, if matched then return the value
    • If Above condition fails, Iterate through the linked list till we get an element in the linked list who's next reference is null.
    • While traversing, compare the hash value of the node first if that matched then compare the key. if the hash value & key are matched it will return value otherwise null.


Let's see the hashmap in action
I have a simple example where we put the value to HashMap & retrieve it from there using keys. here I have put the eclipse IDE screenshot where you can understand the flow.


1. Firstly when we have initialized the HashMap, You can see the table is empty.


2. Here we have inserted our first element in the hashmap. Here you can see the table is initialized with the 16 array element. each of them is a bucket for the hashmap. The element with key "one" is inserted into the 7th bucket. I have highlighted it with a red color square.


3. The hash value for the key is calculated based on the below calculation.
4. Below is the calculation which decides in which bucket the element should go. As per output, the element should go to 7th bucket.


5. Here we have inserted the 2nd element in the map. the same calculation is applied here for the hash value computation & bucket identification.


6. Here 3rd element is inserted in the 13th bucket, Here you can see 13th bucket already have an element, so our 3rd attached to the next node of the last element in the 13th bucket (i.e. element with key "two").


7. While retrieving these elements, the same process is applied, hash value generation & bucket identification. below computation will determine where the values should be looked for. I mean in which bucket it should be looked for.




I hope you have understood the basic working of hashmap and its internal. Please provide your suggestion/queries and feedback in the comment section below.

Have a wonderful day & Happy Coding !!!
Anil Sable

0 comments:

Post a Comment