Double Hashing
Overview
Double Hashing is a computer programming technique used in conjunction with open addressing in hash tables to resolve hash collisions, by using a secondary hash of the key as an offset when a collision occurs.
Takeaways
 Complexity of Double hashing algorithm
 Time complexity  O(n)
Introduction to Double Hashing
Have you ever spoken with a bank customer care executive? For any complaint or query, the first thing they ask you is your bank account number. What do you think? Why would they need your account number?
Yes, exactly! Your account number reveals all the information about you. For each of their customers, the bank maintains a keyvalue pair. The key is your account number and the value is your complete information. We can say that the bank has hashed your entire information in the form of an account number.
Hashing is a powerful technique used for storing and retrieving data in average constant time. In this technique, we store data or some keys in a fixedsize array structure known as a hashtable.
Each key is mapped to a specific location on the hashtable. This mapping is known as a hash function. An ideal hash function must map each unique key to a unique location on the hash table.
In most practical scenarios, the number of keys is way too large than the number of locations in the hash table, as a result, no matter how good the hash function is, collisions are bound to happen.
A collision is said to occur when two or more distinct keys are hashed to the same location of the hash table. To resolve these collisions, we need some kind of collision resolution techniques. One such effective collision resolution technique is Double Hashing.
What is Double Hashing?
In double hashing, we make use of two hash functions. The first hash function is $h_1$(k), this function takes in our key and gives out a location on the hashtable. If the new location is empty, we can easily place our key in there without ever using the secondary hash function.
However, in case of a collision, we need to use secondary hashfunction $h_2$(k) in combination with the first hashfunction $h_1$(k) to find a new location on the hashtable. The combined hashfunction used is of the form h(k,i) = ($h_1$(k) + i * $h_2$(k))%m. Here, i is an nonnegative integer which signifies the collision number, k = element/key which is being hashed and m= hash table size.
The use of secondary hashfunction $h_2$(k) after the collision, helps us to reach new locations on the hashtable, each new location is at a distance of $h_2$(k), 2*$h_2$(k), 3*$h_2$(k)….., this results in a nonlinear fashion of addressing hashtable which reduces the number of collisions.
A very important point to be considered is that both the hash functions should be computed in order of O(1) time.

In the diagram, we have a hashtable of size 5 and two keys 10, 15 which need to be inserted in the hashtable. We are using two hashfunction here, first hashfunction = $h_1$(key) = key % 5 and secondary hashfunction = $h_2$(key) = key % 7

We start with the first key 10 and apply our first hashfunction to get the location for 10 on the hashtable.
Location of key 10 = $h_1$(10) = 10 % 5 = $0^{th}$ location. Now the 0th location is empty and hence we can place our 10 there. No need to use the secondary hash function here. 
Now, we take our second key 15 and apply our first hash function again.
Location of key 15 = $h_1$(15) = 15 % 5 = $0^{th}$ location. Oops! This time the $0^{th}$ location is not empty, hence it is a collision. We need to find a new location for key 15, for this we use a secondary hashfunction.
New Location of key 15 = $h_1$(15) + i * $h_2$(15) = 0 + (1 * 15 % 7) = $1^{st}$ location. The first location of the hashtable is empty and we can place key 15 here.
Example of Double Hashing in Data Structure
The idea behind double hashing is fairly simple,
 Take the key you want to store on the hashtable.
 Apply the first hash function $h_1$(key) over your key to get the location to store the key.
 If the location is empty, place the key on that location.
 If the location is already filled, apply the second hash function $h_2$(key) in combination with the first hash function $h_1$(key) to get the new location for the key.
Let’s explore this idea in more detail, with a handson problem,
Problem Statement:
Insert the keys 79, 69, 98, 72, 14, 50 into the Hash Table of size 13. Resolve all collisions using Double Hashing where first hashfunction is $h_1$(k) = k mod 13 and second hashfunction is $h_2$(k) = 1 + (k mod 11)
Solution:
Initially, all the hash table locations are empty. We pick our first key = 79 and apply $h_1$(k) over it,
$h_1$(79) = 79 % 13 = 1, hence key 79 hashes to $1^{st}$ location of the hash table. But before placing the key, we need to make sure the $1^{st}$ location of the hash table is empty. In our case it is empty and we can easily place key 79 in there.
Second key = 69, we again apply $h_1$(k) over it, $h_1$(69) = 69 % 13 = 4, since the $4^{th}$ location is empty we can place 69 there.
Third key = 98, we apply $h_1$(k) over it, $h_1$(98) = 98 % 13 = 7, $7^{th}$ location is empty so 98 placed there.
Fourth key = 72, $h_1$(72) = 72 % 13 = 7, now this is a collision because the $7^{th}$ location is already occupied, we need to resolve this collision using double hashing.
$h_{new}$ = [ $h_1$(72) + i * $h_2$(72) ] % 13
= [ 7 + 1 * ( 1 + 72 % 11) ] % 13
= 1
Location $1^{st}$ is already occupied in the hashtable, hence we again had a collision. Now we again recalculate with i = 2
$h_{new}$ = [ $h_1$(72) + i * $h_2$(72) ] % 13
= [ 7 + 2 * ( 1 + 72 % 11) ] % 13
= 8
Location $8^{th}$ is empty in hashtable and now we can place key 72 in there.
Fifth key = 14, $h_1$(14) = 14%13 = 1, now this is again a collision because 1st location is already occupied, we now need to resolve this collision using double hashing.
$h_{new}$ = [ $h_1$(14) + i * $h_2$(14) ] % 13
= [ 1 + 1 * ( 1 + 14 % 11) ] % 13
= 5
Location $5^{th}$ is empty and we can easily place key 14 there.
Sixth key = 50, $h_1$(50) = 50%13 = 11, $11^{th}$ location is already empty and we can place our key 50 there.
Since all the keys are placed in our hash table the double hashing procedure is completed. Finally, our hash table looks like the following,
Why Use Double Hashing?
Double Hashing is one of the popular collision resolution techniques. The other popular variants which serve the same purpose are Linear Probing and Quadratic Probing. But if other techniques are available, then why do we need double hashing in the first place?
Double Hashing offers better resistance against clustering. A major reason for this is the use of dual functions. Dual functions enable double hashing to perform a more uniform distribution of keys, resulting in lower collisions.
Advantages of Double Hashing in Data Structure
 After each collision, we recompute the new location for the element in the hashtable. For successive collisions, this generates a sequence of locations. If the sequences are unique then the number of collisions are less.
 Double Hashing creates most unique sequences, providing a more uniform distribution of keys within the hashtable.
 If the hash function is not good enough, the elements tend to form grouping in the hashtable. This problem is known as clustering. Double Hashing is least prone to clustering.
Practice Problem Based on Double Hashing
Problem Statement 1:
Given the two hash functions, $h_1$(k) = k mod 23 and $h_2$(k) = 1 + k mod 19. Assume the table size is 23. Find the address returned by double hashing after 2nd collision for the key = 90.
Solution:
We will use the formula for double hashing
h(k,i) = ( $h_1$(k) + i * $h_2$(k) )%m
As it is given, k = 90, m = 23
Since the 2nd collision has already occurred, i = 2. Substituting the values in the above formula we get,
h(90,2) = [ ( 90 % 23 ) + 2 * ( 1 + 90 % 19 ) ] % 23
= [ 21 + 2 * 15 ] % 23
= 5
Hence after the second collision, the address returned by double hashing for Key = 90 is 5.
Problem Statement 2:
Given an Array[] of positive integers and a number n, find if there exists a pair of numbers in the array whose sum equals n. If such a pair exists, return true else false.
Example:

Input: A = [ 6, 4, 3, 1 ] and n = 7
Output: True, the pairs are {4,3} or {6,1}
Explanation: 4 + 3 = 7 or 6 + 1 = 7 
Input: A = [ 6, 4, 3, 1 ] and n = 8
Output: False, no such pair exist
Explanation: There is no pair of numbers in array, whose addition would sum up to 8
BEFORE MOVING TO A SOLUTION, JUST PAUSE AND PONDER
Solution:
The following JAVA code provides the solution.
 We maintain a hashset of integers which would store the complement of array numbers.
 The hashset ensures that each lookup can be completed in average O(1) time. It internally maintain’s socalled buckets and then tries to distribute our elements among these buckets depending upon the hashfunction.
 The core logic is to iteratively check our hashset whether it contains the desired complement number or not. For ex: If the sum was 8 and the first element of the array is 3 then in the hashset we check if 5 (8  3) exists or not.
 If such a complement is not present in the hashset, then we simply put the array number in the hashset and move forward.
 If such a complement exists, then we return true and stop.
Time Complexity of Double Hashing:
 Each lookup in the hashset costs O(1) constant time.
 In the worst case we might not find such a compliment and iterate through the entire array, this would cost us O(n) linear time.
 Rest operations like subtraction are mere constanttime operations.
 Hence total time complexity = O(n) time.
Space Complexity of Double Hashing:
 We need to maintain an extra hashset of size upto n elements which costs us extra O(n) space.
 Hence total space complexity = O(n) time.
Join our Data Structures in C++ Certification Course now and gain a competitive edge in your programming career!
Conclusion
 In this article, we demystified the cryptic world of Double Hashing. We performed handson with its working and applications.
 In the end, we exploited its benefits and performed searches in a constant average time.
 Please make sure to practice the following problems to test out your hashing concepts.