What is Hash Collision?
Hash collisions have negative impact on the lookup time of HashMap.
When multiple keys end up in the same bucket, then values along with their keys are placed in a linked list.
In case of retrieval, linked list has to be traversed to get the entry. In worst case scenario, when all keys are mapped to the same bucket, the lookup time of HashMap increases from O(1) to O(n).
A collision occurs when a hash function returns the same bucket location for two different keys.
How Hash Collision is handled?
HashMap handles collision by using a linked list to store map entries ended up in same array location or bucket location.
From Java 8 onwards, HashMap, ConcurrentHashMap, and LinkedHashMap will use the balanced tree in place of linked list to handle frequently hash collisions. The idea is to switch to the balanced tree once the number of items in a hash bucket grows beyond a certain threshold.
Above changes ensure performance of O(log(n)) in worst case scenarios (hash function is not distributing keys properly) and O(1) with proper hashCode().
The threshold of switching to the balanced tree is defined as TREEIFY_THRESHOLD constant in java.util.HashMap JDK 8 code. Currently, its values are 8, which means if there are more than 8 elements in the same bucket then HashMap will use a tree instead of a linked list to hold them in the same bucket.
Correlation between performance, hash collision & loading factor. Lower load factor = more free buckets = less chances of collision = high performance = high space requirement. Higher load factor = fewer free buckets = higher chance of collision = lower performance = lower space requirement
How linked list is replaced with balanced tree?
In Java 8, HashMap replaces linked list with a balanced tree when the number of elements in a bucket reaches certain threshold. While converting the list to balanced tree, hashcode is used as a branching variable.
If there are two different hashcodes in the same bucket, one is considered bigger and goes to the right of the tree and other one to the left. But when both the hashcodes are equal, HashMap assumes that the keys are comparable, and compares the key to determine the direction so that some order can be maintained. It is a good practice to make the keys of HashMap comparable.
This JDK 8 change applies only to HashMap, LinkedHashMap and ConcurrentHashMap.
Post a Comment