Hashing in DBMS

Learn via video course
FREE
View all courses
DBMS Course - Master the Fundamentals and Advanced Concepts
DBMS Course - Master the Fundamentals and Advanced Concepts
by Srikanth Varma
1000
5
Start Learning
DBMS Course - Master the Fundamentals and Advanced Concepts
DBMS Course - Master the Fundamentals and Advanced Concepts
by Srikanth Varma
1000
5
Start Learning
Topics Covered

Overview

Hashing is an advantageous technique that tells the exact location of the data using a hash function. For quicker retrieval of data in DBMS hashing technique is vastly used as it does not use the index structure to find the location of desired data.

Hashing in DBMS is classified into two types viz. Static hashing and Dynamic hashing.

What is Hashing in DBMS ?

In huge databases it is very impractical to search all the index values to get the desired data as searching all the index values through all the levels is not an efficient way to reach destination data block to retrieve the desired data.

Therefore, hashing in DBMS is an effective technique used to directly search the location of data without using index structure. It uses a hash function (a mathematical function) to find the exact location of a record in the minimum amount of time.

Before diving deep into the concept, let's have a look at the important terminologies which will be frequently used in the article.

Important Terminologies

  • Transform Your Career

    Choose from our industry-leading programs designed for career success

    NSDC Certified

    Modern Software and AI Engineering Program

    Master full-stack development with AI integration

    12 MonthsDuration
    AI-LedCurriculum
    Career SupportSupport
    GoogleAmazonPaytm+1000 more
    Go to Program
    NSDC Certified

    Modern Data Science and ML with specialisation in AI

    Advanced data science techniques with AI specialization

    12 MonthsDuration
    AI-LedCurriculum
    Career SupportSupport
    GoogleAmazonPaytm+1000 more
    Go to Program
    NSDC Certified

    Advanced AIML with Specialisation in Agentic AI

    Deep dive into AIML with focus on Agentic systems

    12 MonthsDuration
    AI-LedCurriculum
    Career SupportSupport
    GoogleAmazonPaytm+1000 more
    Go to Program
    NSDC Certified

    DevOps, Cloud & AI Platform Engineering

    Build and manage AI-powered cloud infrastructure

    12 MonthsDuration
    AI-LedCurriculum
    Career SupportSupport
    GoogleAmazonPaytm+1000 more
    Go to Program
    NSDC Certified

    AI Engineering Advanced Certification by IIT-Roorkee

    Premier AI engineering certification from IIT-Roorkee

    3 MonthsDuration
    AI-LedCurriculum
    Career SupportSupport
    Program highlights
    Go to Program

    Data Bucket

    Data buckets are the memory locations where the actual data records are stored.
  • Hash function

    A mathematical function that returns the value of the address of the desired data block. Mostly, it uses the primary key to generate the address of a data block.
  • Hash Index

    It denotes the address of a data block generated by the hash function.
  • Turn Learning into Career Growth

    1200+Hiring Partners
    89%Placement Rate
    11,000+Placements
    147%Avg Salary Increment
    2.5XCareer Growth
    ₹23 LPAAvg Post-Scaler Salary
    1200+Hiring Partners
    89%Placement Rate
    11,000+Placements
    147%Avg Salary Increment
    2.5XCareer Growth
    ₹23 LPAAvg Post-Scaler Salary

    Linear Probing

    It is used when a record already exists on the memory location calculated by Hash Function. It linearly searches for an empty bucket until it is found.
  • Scaler Placement Report and Statistics

    ₹23L
    AVG CTC
    SCALER PLACEMENT PROOF

    Scaler learners achieved 2.5x salary growth with average post-Scaler CTC reaching ₹23L.

    11,000+placements
    650+companies
    Verified data

    Quadratic Probing

    It is similar to linear probing with the only difference as it uses an arbitrary polynomial function to search for the next empty bucket instead of searching linearly.
  • Bucket Overflow

    The condition when the memory address generated by the hash function is not empty(i.e.i.e. it already contains some data record) is known as bucket overflow as shown in the image below -

image1

Types of Hashing in DBMS

Hashing techniques can be further divided into two types, i.e.,i.e., static hashing and dynamic hashing.

Types of hashing in dbms

Static Hashing

In static hashing, for a key kk, hash function h(x)h(x) always generates the same memory address. For example, if h(x)=x%7h(x)=x\%7 then for any xx the value will always be the same.

Thus if we generate the address for a key 107107, then h(x)h(x) will always return the same bucket address, i.e.,i.e., 3. There will be no change in the bucket address. Hence the number of buckets in the memory always remains constant. static hashing

Operations in Static Hashing

  • Insertion - To insert a new record in the table we will calculate the bucket address for search key kk, and the record will be stored there.
  • Search - To search a record we will use the same hash function to get the bucket address of the record and retrieve the data from there.
  • Deletion - To delete a record from the table we will firstly search for that record and then simply delete it from that location.

Static hashing can also be further divided to open and closed hashing.

Open Hashing

In open hashing, whenever a collision occurs, we probe for the next empty bucket to enter the new record. The order of checking are majorly of two types -

  • Linear Probing
  • Quadratic Probing

For example -

Closed Hashing

In closed hashing, the collision condition is handled by linking the new record after the previous one, due to which is also termed as "Hashing with separate chaining". For example - closed hashing

Dynamic Hashing

The main disadvantage of static hashing is that it does not expand or shrink as the size of the database expands or shrinks.

To counter this problem, we use dynamic hashing which allows on-demand addition and removal of data buckets. It is also known as _extended hashing_. Since this method is dynamic in nature, it always allows efficient insertion and deletion operations.

For example -

Let's say we have to insert records with the following keys as shown in the image given below along with their hash index in binary form. Dynamic hashing

Now, if the global depth kk is 2 then we will select two LSBs of the hash index. Therefore the obtained values are 01,11,00,11,1001,11,00,11,10.

Now we will insert the records accordingly in the buckets with their respective ids as shown below -

Dynamic hashing example

Before proceeding with the operations of dynamic hashing let us define two more terminologies -

  • Directories - These are the containers that store pointers to the buckets. Each bucket has an "id" associated with them.
  • Global depth - Global depth can be defined as the "number of bits in directory id".

Operation in Dynamic Hashing

  • Search -

    • Calculate hash address of the record based on the primary key of the record.
    • Convert hash address into binary form, let's say we have 13 as hash address then its binary form is 11011101.
    • Select kk LSBs from the binary form, where kk is the global depth of the directory. If k=3k=3 then we select 1101.
    • Now navigate to the bucket with directory id 101101 and search the record.
  • Insertion -

    • Repeat the first three steps of the searching procedure, i.e.,i.e., getting and converting the hash address in binary form and then selecting kk LSBs from it.
    • Now, go to the bucket with the obtained address and insert the record there.
  • Deletion -

    • Again repeat the first 3 steps of the searching procedure.
    • Now go to the bucket with the obtained address and delete the record from there.

Conclusion

  • Hashing is an efficient process used in DBMS for quick retrieval of the data.

  • There are two types of hashing in DBMS, i.e.,i.e., Static hashing and Dynamic hashing.

  • Static hashing can be further classified to open hashing and closed hashing.

  • Dynamic hashing is more advantageous than static hashing because it can expand or shrink with the size of the database.

Read More:

Hiring Partners:
GoogleGoogleAmazonAmazonMicrosoftMicrosoftFlipkartFlipkartAdobeAdobe1200+ more