What is the Dirty Read Problem in DBMS?

To maximize efficiency, a DBMS processes transactions concurrently. However, problems can arise when more than one transaction performs read/write operations on the same data item.

As the name suggests, a Dirty Read occurs when a transaction reads "dirty" data, that is, data that has not yet been committed to the database.

More formally, a Dirty Read occurs when a transaction (say X) reads a row that has been modified by another transaction (say Y) that has not yet committed. If Y is later rolled back, X has worked with a value that never officially existed in the database.

Example: Dirty Read Problem

Suppose two people, A and B, are trying to book a train ticket on the IRCTC platform, and their transactions perform the following sequence of operations -

Time | A                 | B
T1   | READ(SEATS)       |
T2   | SEATS = SEATS - 1 |
T3   | WRITE(SEATS)      |
T4   |                   | READ(SEATS)
T5   |                   | COMMIT
T6   | ROLLBACK          |

Suppose 4 seats are available before A and B begin booking. A reads SEATS as 4, starts the booking process, and writes SEATS as 3 without committing. B then reads SEATS and sees 3. Some time later, A's transaction is rolled back (say, due to an issue with the payment gateway), so SEATS is restored to its initial value of 4. B has therefore read a value that was never committed, which means B has read a dirty value of SEATS.
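The schedule above can be replayed with a small in-memory sketch. This is only an illustration, not how a real DBMS is implemented: the `ToyDatabase` and `Transaction` classes and the `run_schedule` function are hypothetical names introduced here, and the store deliberately provides no isolation so that uncommitted writes are visible to every reader.

```python
class ToyDatabase:
    """Hypothetical in-memory store with NO isolation between transactions."""

    def __init__(self, seats):
        self.data = {"SEATS": seats}


class Transaction:
    """Hypothetical transaction that writes in place and keeps an undo log."""

    def __init__(self, db, name):
        self.db = db
        self.name = name
        self.undo_log = {}  # key -> value before this transaction's first write

    def read(self, key):
        # No isolation: the reader sees whatever is stored, committed or not.
        return self.db.data[key]

    def write(self, key, value):
        # Remember the old value once so a rollback can restore it.
        self.undo_log.setdefault(key, self.db.data[key])
        self.db.data[key] = value

    def commit(self):
        # The writes become permanent; the undo information is discarded.
        self.undo_log.clear()

    def rollback(self):
        # Restore every value this transaction overwrote.
        self.db.data.update(self.undo_log)
        self.undo_log.clear()


def run_schedule():
    db = ToyDatabase(seats=4)
    a = Transaction(db, "A")
    b = Transaction(db, "B")

    seats = a.read("SEATS")      # T1: A reads 4
    seats = seats - 1            # T2: A computes 3
    a.write("SEATS", seats)      # T3: A writes 3 (uncommitted)
    seen_by_b = b.read("SEATS")  # T4: B reads the uncommitted value
    b.commit()                   # T5: B commits
    a.rollback()                 # T6: A rolls back, SEATS is restored

    return seen_by_b, db.data["SEATS"]


seen_by_b, final_seats = run_schedule()
print("B saw:", seen_by_b, "| SEATS after rollback:", final_seats)
```

In this sketch B works with 3 while the database ends up holding 4, which is exactly the mismatch described in the example.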

Overcoming the Dirty Read Concurrency Problem

The following concurrency control protocols can be used to overcome the Dirty Read problem, along with other concurrency problems that may arise -

  • Lock-Based Protocols - Isolation between transactions is the most important tool for attaining consistency. Under a lock-based protocol, a transaction must acquire a lock on a data item before it reads or writes that item, and any other transaction that requests a conflicting lock has to wait until the lock is released. If a write lock is held until the transaction commits or rolls back, no other transaction can read the uncommitted value.

  • Timestamp-Based Protocols - According to this protocol, every transaction has a timestamp attached to it, based on the time at which the transaction entered the system. In addition, every data item has a read timestamp and a write timestamp, which record the timestamps of the latest transactions that read and wrote it, respectively. Conflicting operations are ordered by comparing these timestamps.

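  • Validation-Based Protocols - In this protocol, each transaction goes through three phases: the Read Phase, in which it reads data and keeps its updates in local copies; the Validation Phase, in which it is checked for conflicts with concurrent transactions; and the Write Phase, in which its updates are applied to the database only if validation succeeds. Because updates reach the database only after validation, other transactions do not see uncommitted data.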
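As a rough illustration of the lock-based idea, the sketch below replays the booking schedule with an exclusive lock that is held until the writer commits or rolls back. Everything here is an assumption made for illustration: `LockingDatabase`, `Transaction`, `WouldBlock`, and `run_locked_schedule` are hypothetical names, shared read locks are not modeled, and a blocked transaction raises an exception instead of actually waiting.

```python
class WouldBlock(Exception):
    """Raised when a transaction would have to wait for a lock (hypothetical)."""


class LockingDatabase:
    def __init__(self, seats):
        self.data = {"SEATS": seats}
        self.write_locks = {}  # key -> name of the transaction holding the lock


class Transaction:
    def __init__(self, db, name):
        self.db = db
        self.name = name
        self.undo_log = {}  # key -> value before this transaction's first write

    def _check_lock(self, key):
        # Refuse access while another transaction holds the exclusive lock.
        owner = self.db.write_locks.get(key)
        if owner is not None and owner != self.name:
            raise WouldBlock(f"{self.name} must wait for {owner} on {key}")

    def read(self, key):
        self._check_lock(key)
        return self.db.data[key]

    def write(self, key, value):
        self._check_lock(key)
        self.db.write_locks[key] = self.name  # take the exclusive lock
        self.undo_log.setdefault(key, self.db.data[key])
        self.db.data[key] = value

    def _release_locks(self):
        owned = [k for k, o in self.db.write_locks.items() if o == self.name]
        for key in owned:
            del self.db.write_locks[key]
        self.undo_log.clear()

    def commit(self):
        # Locks are released only at the end of the transaction.
        self._release_locks()

    def rollback(self):
        self.db.data.update(self.undo_log)  # undo the uncommitted writes
        self._release_locks()


def run_locked_schedule():
    db = LockingDatabase(seats=4)
    a = Transaction(db, "A")
    b = Transaction(db, "B")

    a.write("SEATS", a.read("SEATS") - 1)  # A writes 3 and locks SEATS
    try:
        b.read("SEATS")                    # B tries to read the dirty value
        blocked = False
    except WouldBlock:
        blocked = True                     # B has to wait instead

    a.rollback()                           # A rolls back and releases the lock
    seen_by_b = b.read("SEATS")            # B retries and reads committed data
    b.commit()
    return blocked, seen_by_b


blocked, seen_by_b = run_locked_schedule()
print("B was blocked:", blocked, "| B finally read:", seen_by_b)
```

Holding the lock until the end of the transaction is the design choice that matters here: if A released it right after writing, B could still read the uncommitted value.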

For more detailed insight into concurrency problems in DBMS and their known solutions, we strongly recommend our article on Concurrency control in DBMS.

Conclusion

  • A dirty read problem may arise when several transactions that operate on shared data execute simultaneously.
  • It occurs when a transaction reads data that has been updated by another transaction that is still uncommitted.
  • If the updating transaction is then rolled back for some reason, the other transaction continues to work with dirty data.
  • To avoid the Dirty Read problem, certain rules are defined for the execution of concurrent transactions. These rules are known as concurrency control protocols.
  • The following types of protocols can be used to deal with the dirty read problem -
    • Lock-Based Protocols
    • Timestamp-Based Protocols
    • Validation-Based Protocols