Decomposition in DBMS

Video Tutorial
FREE
 Decompositions with solved problems thumbnail
This video belongs to
DBMS Course - Master the Fundamentals and Advanced Concepts
16 modules
Certificate
Topics Covered

Decomposition in DBMS involves dividing a table into multiple tables, aiming to eradicate redundancy, inconsistencies, and anomalies. This process, represented as {X1, X2,……Xn}, ensures dependency preservation and losslessness. When a relational model's relation lacks appropriate normal form, decomposition becomes necessary to address issues like information loss, anomalies, and redundancy, ultimately enhancing the overall design quality and efficiency of the database.

There are two types of decomposition as shown below:

Type of Decomposition

Rules for Decomposition

Whenever we decompose a relation, there are certain properties that must be satisfied to ensure no information is lost while decomposing the relations. These properties are:

  1. Lossless Join Decomposition.
  2. Dependency Preserving.

Lossless Join Decomposition

A lossless Join decomposition ensures two things:

  • No information is lost while decomposing from the original relation.
  • If we join back the sub decomposed relations, the same relation that was decomposed is obtained.

We can follow certain rules to ensure that the decomposition is a lossless join decomposition Let’s say we have a relation R and we decomposed it into R1 and R2, then the rules are:

  1. The union of attributes of both the sub relations R1 and R2 must contain all the attributes of original relation R.

    R1 ∪ R2 = R

  2. The intersection of attributes of both the sub relations R1 and R2 must not be null, i.e., there should be some attributes that are present in both R1 and R2.

    R1 ∩ R2 ≠ ∅

  3. The intersection of attributes of both the sub relations R1 and R2 must be the superkey of R1 or R2, or both R1 and R2.

    R1 ∩ R2 = Super key of R1 or R2

Let’s see an example of a lossless join decomposition. Suppose we have the following relation EmployeeProjectDetail as:

<EmployeeProjectDetail>

Employee_CodeEmployee_NameEmployee_EmailProject_NameProject_ID
101Johnjohn@demo.comProject103P03
101Johnjohn@demo.comProject101P01
102Ryanryan@example.comProject102P02
103Stephaniestephanie@abc.comProject102P02

Now, we decompose this relation into EmployeeProject and ProjectDetail relations as:

<EmployeeProject>

Employee_CodeProject_IDEmployee_NameEmployee_Email
101P03Johnjohn@demo.com
101P01Johnjohn@demo.com
102P04Ryanryan@example.com
103P02Stephaniestephanie@abc.com

The primary key of the above relation is {Employee_Code, Project_ID}.

<ProjectDetail>

Project_IDProject_Name
P03Project103
P01Project101
P04Project104
P02Project102

The primary key of the above relation is {Project_ID}.

Now, let’s see if this is a lossless join decomposition by evaluating the rules discussed above:

Let’s first check the EmployeeProject ProjectDetail:

<EmployeeProject ProjectDetail>

Employee_CodeProject_IDEmployee_NameEmployee_EmailProject_Name
101P03Johnjohn@demo.comProject103
101P01Johnjohn@demo.comProject101
102P04Ryanryan@example.comProject104
103P02Stephaniestephanie@abc.comProject102

As we can see all the attributes of EmployeeProject and ProjectDetail are in EmployeeProject ProjectDetail relation and it is the same as the original relation. So the first condition holds.

Now let’s check the EmployeeProject ProjectDetail:

<EmployeeProject ProjectDetail>

Project_ID
P03
P01
P04
P02

As we can see this is not null, so the the second condition holds as well. Also the EmployeeProject ∩ ProjectDetail = Project_Id. This is the super key of the ProjectDetail relation, so the third condition holds as well.

Now, since all three conditions hold for our decomposition, this is a lossless join decomposition.

Lossless vs Lossy Decomposition

In a lossy decomposition, one or more of these conditions would fail and we will not be able to recover Complete information as present in the original relation. For example, let's say we decompose our original relation EmployeeProjectDetail into EmployeeProject and ProjectDetail relations as:

<EmployeeProject>

Employee_CodeEmployee_NameEmployee_Email
101Johnjohn@demo.com
102Ryanryan@example.com
103Stephaniestephanie@abc.com

The primary key of the above relation is {Employee_Code}.

<ProjectDetail>

Project_IDProject_Name
P03Project103
P01Project101
P04Project104
P02Project102

The primary key of the above relation is {Project_ID}.

Now, the intersection EmployeeProject ProjectDetail is null. Therefore there is no way for us to map a project to its employees. Thus this is a lossy decomposition.

Dependency Preserving

The second property of lossless decomposition is dependency preservation which says that after decomposing a relation R into R1 and R2, all dependencies of the original relation R must be present either in R1 or R2 or they must be derivable using the combination of functional dependencies present in R1 and R2.

Let’s understand this from the same example above:

<EmployeeProjectDetail>

Employee_CodeEmployee_NameEmployee_EmailProject_NameProject_ID
101Johnjohn@demo.comProject103P03
101Johnjohn@demo.comProject101P01
102Ryanryan@example.comProject104P04
103Stephaniestephanie@abc.comProject102P02

In this relation we have the following FDs:

  • Employee_Code -> {Employee_Name, Employee_Email}
  • Project_ID - > Project_Name

Now, after decomposing the relation into EmployeeProject and ProjectDetail as:

<EmployeeProject>

Employee_CodeProject_IDEmployee_NameEmployee_Email
101P03Johnjohn@demo.com
101P01Johnjohn@demo.com
102P04Ryanryan@example.com
103P02Stephaniestephanie@abc.com

In this relation we have the following FDs:

  • Employee_Code -> {Employee_Name, Employee_Email}

<ProjectDetail>

Project_IDProject_Name
P03Project103
P01Project101
P04Project104
P02Project102

In this relation we have the following FDs:

  • Project_ID - > Project_Name

As we can see that all FDs in EmployeeProjectDetail are either part of the EmployeeProject or the ProjectDetail, So this decomposition is dependency preserving.

Conclusion

  • Decomposition is the process of breaking an original relation into multiple sub relations.
  • Decomposition helps to remove anomalies, redundancy, and other problems in a DBMS.
  • Decomposition can be lossy or lossless.
  • An ideal decomposition should be lossless join decomposition and dependency preserving.

Read More: