SQL SELECT DISTINCT Statement

Overview

The DISTINCT keyword in SQL eliminates duplicate rows from the result returned by a query. It is used together with the SELECT statement: when a table contains duplicate records, SELECT DISTINCT returns each unique record only once.

Introduction to DISTINCT in SQL

Imagine working with tables that contain numerous duplicate entries when you need only the unique ones for your subsequent operations. In such a situation, the SQL DISTINCT keyword can be immensely helpful. It removes duplicate values from the query result and returns each unique value once. In its simplest form, DISTINCT is applied to a single column. The underlying table is not modified; only the result set is deduplicated.

DISTINCT can also be used inside aggregate functions such as COUNT, MAX, SUM, and AVG, in which case the aggregate is computed over the distinct values only. DISTINCT is not limited to a single column: when several columns are selected, it eliminates rows in which all the selected columns are identical to those of another row.
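As a minimal sketch of DISTINCT inside aggregate functions, the snippet below uses Python's built-in sqlite3 module only as a convenient way to run the SQL. The Students table and its name and age columns are assumptions made for illustration; the exact syntax accepted by other database systems may differ slightly.

```python
import sqlite3

# In-memory SQLite database with a hypothetical Students table (name, age).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Nobita", 18), ("Suneo", 16), ("Shizuka", 18), ("Gian", 23),
     ("Kiteretsu", 22), ("Kenichi", 19), ("Mioko", 22)],
)

# COUNT(age) counts every non-NULL age; COUNT(DISTINCT age) counts each age once.
# SUM behaves the same way: duplicates are dropped before the values are added.
query = """
    SELECT COUNT(age), COUNT(DISTINCT age), SUM(age), SUM(DISTINCT age)
    FROM Students
"""
count_all, count_unique, sum_all, sum_unique = conn.execute(query).fetchone()
print(count_all, count_unique, sum_all, sum_unique)
conn.close()
```

With this sample data there are seven ages but only five distinct ones, so the DISTINCT variants of COUNT and SUM return smaller values than the plain ones.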

Note: If a column contains NULL values, DISTINCT treats NULL as a value in its own right, so the result includes a single NULL alongside the other distinct values.

Syntax of DISTINCT in SQL

The general syntax is:

SELECT DISTINCT column_name FROM table_name WHERE condition;

In this syntax, DISTINCT comes immediately after SELECT and is followed by the name of the column (or columns) to which it applies. The FROM clause names the table, and the optional WHERE condition filters the rows before duplicates are removed from the result.

Parameters of DISTINCT in SQL

A DISTINCT clause can be applied to any valid SELECT query. It compares rows across all selected columns and keeps a single copy of each unique combination, discarding the repeats.

column_name: name of the column on which we want to apply the DISTINCT clause.

table_name: name of the table from which we want to retrieve the records.

WHERE condition: an optional clause that restricts the query to the records satisfying the given condition. It is applied before duplicates are removed.
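The three parameters can be seen together in a small sketch. Here column_name is age, table_name is Students, and the WHERE condition is age >= 19; all three are illustrative choices, and Python's sqlite3 module is used only to execute the statement.

```python
import sqlite3

# Hypothetical Students table holding only the columns this example needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Nobita", 18), ("Suneo", 16), ("Shizuka", 18), ("Gian", 23),
     ("Kiteretsu", 22), ("Kenichi", 19), ("Mioko", 22)],
)

# The WHERE filter keeps the ages 23, 22, 19, 22; DISTINCT then collapses
# the repeated 22 so each remaining age appears once.
query = "SELECT DISTINCT age FROM Students WHERE age >= 19"
ages = [row[0] for row in conn.execute(query)]
print(sorted(ages))
conn.close()
```

The result is sorted in Python before printing because SQL does not guarantee any row order without an ORDER BY clause.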

How to Use DISTINCT in SQL?

Let us consider the following Students table, containing the roll no, name, age, address, and course name in which a student is enrolled.

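| Roll No | Name | Age | Address | Course |
|---|---|---|---|---|
| 101 | Nobita | 18 | Japan | Physics |
| 102 | Suneo | 16 | America | Aerospace |
| 103 | Shizuka | 18 | Japan | Chemistry |
| 104 | Gian | 23 | Korea | Maths |
| 105 | Kiteretsu | 22 | London | Geology |
| 106 | Kenichi | 19 | Singapore | English |
| 107 | Mioko | 22 | Australia | Biology |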

First, write a SQL query to return all students' ages, including duplicate values.
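One way to write this query is sketched below. It runs through Python's sqlite3 module against an in-memory copy of the Students table; the column names (roll_no, name, age, address, course) are assumed spellings of the table headers.

```python
import sqlite3

# In-memory copy of the Students table (column names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(roll_no INTEGER, name TEXT, age INTEGER, address TEXT, course TEXT)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Nobita", 18, "Japan", "Physics"),
        (102, "Suneo", 16, "America", "Aerospace"),
        (103, "Shizuka", 18, "Japan", "Chemistry"),
        (104, "Gian", 23, "Korea", "Maths"),
        (105, "Kiteretsu", 22, "London", "Geology"),
        (106, "Kenichi", 19, "Singapore", "English"),
        (107, "Mioko", 22, "Australia", "Biology"),
    ],
)

# Plain SELECT: one row per student, so repeated ages are kept.
all_ages = [row[0] for row in conn.execute("SELECT age FROM Students")]
print(all_ages)
conn.close()
```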

Output

age
18
16
18
23
22
19
22

A plain SELECT on the age column (SELECT age FROM Students;) returns the age of every student, including the duplicate values 18 and 22, each of which occurs twice in the age column of the Students table. To remove these duplicates, place the DISTINCT keyword immediately after SELECT and before the column name.
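The deduplicating version of the query can be sketched as follows. Only the name and age columns are recreated here, and the sqlite3 module serves purely as a way to run the statement.

```python
import sqlite3

# Hypothetical Students table reduced to the columns this example needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Nobita", 18), ("Suneo", 16), ("Shizuka", 18), ("Gian", 23),
     ("Kiteretsu", 22), ("Kenichi", 19), ("Mioko", 22)],
)

# DISTINCT sits between SELECT and the column name; each age is returned once.
unique_ages = [row[0] for row in conn.execute("SELECT DISTINCT age FROM Students")]
print(unique_ages)
conn.close()
```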

Output

age
18
16
23
22
19

The query SELECT DISTINCT age FROM Students; returns only the unique ages from the Students table. The values 18 and 22 now appear once each instead of twice, because the DISTINCT clause collapses duplicate values in the output.

Examples of DISTINCT in SQL

1. Example of Finding Unique Values in a Single Column

Consider the following Companies table, which contains the company name and location of their headquarters.

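| S No | Name | State | Country |
|---|---|---|---|
| 1 | Microsoft | Washington | USA |
| 2 | BMW | Munich | Germany |
| 3 | Walmart | Arkansas | USA |
| 4 | Vodafone | London | UK |
| 5 | Accenture | Dublin | Ireland |
| 6 | Nissan | Yokohama | Japan |
| 7 | Ericsson | Stockholm | Sweden |
| 8 | Godrej | Mumbai | India |
| 9 | Barclays | London | UK |
| 10 | Nikon | Tokyo | Japan |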

Write a SQL query to find the unique headquarters countries in the Companies table.
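A possible form of this query is sketched below, run through Python's sqlite3 module. The column names s_no, name, state, and country are assumed spellings of the table headers.

```python
import sqlite3

# In-memory copy of the Companies table (column names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Companies (s_no INTEGER, name TEXT, state TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO Companies VALUES (?, ?, ?, ?)",
    [
        (1, "Microsoft", "Washington", "USA"),
        (2, "BMW", "Munich", "Germany"),
        (3, "Walmart", "Arkansas", "USA"),
        (4, "Vodafone", "London", "UK"),
        (5, "Accenture", "Dublin", "Ireland"),
        (6, "Nissan", "Yokohama", "Japan"),
        (7, "Ericsson", "Stockholm", "Sweden"),
        (8, "Godrej", "Mumbai", "India"),
        (9, "Barclays", "London", "UK"),
        (10, "Nikon", "Tokyo", "Japan"),
    ],
)

# Each country is returned once, however many companies are based there.
countries = [row[0] for row in conn.execute("SELECT DISTINCT country FROM Companies")]
print(countries)
conn.close()
```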

Output

country
USA
Germany
UK
Ireland
Japan
Sweden
India

The DISTINCT clause in this query (SELECT DISTINCT country FROM Companies;) filters out the repeated country names, so the output lists each country in the Companies table exactly once.

2. Example of Finding Unique Values in Multiple Columns

Consider the following Companies table, which contains the company names and the location of their headquarters.

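| S No | Name | State | Country |
|---|---|---|---|
| 1 | Microsoft | Washington | USA |
| 2 | BMW | Munich | Germany |
| 3 | Walmart | Arkansas | USA |
| 4 | Vodafone | London | UK |
| 5 | Accenture | Dublin | Ireland |
| 6 | Hitachi | Tokyo | Japan |
| 7 | Ericsson | Stockholm | Sweden |
| 8 | Godrej | Mumbai | India |
| 9 | Barclays | London | UK |
| 10 | Nikon | Tokyo | Japan |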

Now, write a SQL query that uses DISTINCT to find the unique state and country combinations in the Companies table.
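A sketch of the multi-column query follows, again executed with Python's sqlite3 module. Only the name, state, and country columns are recreated, and their names are assumptions.

```python
import sqlite3

# Hypothetical Companies table matching the sample rows in this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Companies (name TEXT, state TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO Companies VALUES (?, ?, ?)",
    [
        ("Microsoft", "Washington", "USA"),
        ("BMW", "Munich", "Germany"),
        ("Walmart", "Arkansas", "USA"),
        ("Vodafone", "London", "UK"),
        ("Accenture", "Dublin", "Ireland"),
        ("Hitachi", "Tokyo", "Japan"),
        ("Ericsson", "Stockholm", "Sweden"),
        ("Godrej", "Mumbai", "India"),
        ("Barclays", "London", "UK"),
        ("Nikon", "Tokyo", "Japan"),
    ],
)

# DISTINCT applies to the (state, country) pair as a whole: a row is dropped
# only when both columns match an earlier row.
pairs = conn.execute("SELECT DISTINCT state, country FROM Companies").fetchall()
for state, country in pairs:
    print(state, country)
conn.close()
```

Note that USA still appears twice in the result, once with Washington and once with Arkansas, because those two pairs differ in the state column.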

Output

| state | country |
|---|---|
| Washington | USA |
| Munich | Germany |
| Arkansas | USA |
| London | UK |
| Dublin | Ireland |
| Tokyo | Japan |
| Stockholm | Sweden |
| Mumbai | India |

The output contains every unique state and country combination in the Companies table. Only eight of the ten rows remain because (London, UK) and (Tokyo, Japan) each occur twice and are returned once.

3. Example of Handling NULL Using DISTINCT Clause

Consider the following Students table, which contains the roll no, names, age, and course in which a student is enrolled.

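| Roll No | Name | Age | Course |
|---|---|---|---|
| 101 | Nobita | 18 | Chemistry |
| 102 | Suneo | 16 | NULL |
| 103 | Shizuka | 18 | Chemistry |
| 104 | Gian | 23 | Maths |
| 105 | Kiteretsu | 22 | Biology |
| 106 | Kenichi | 19 | English |
| 107 | Mioko | 22 | Biology |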

Now, write a SQL query that uses DISTINCT to find the unique courses.
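The query could look like the sketch below. It is run with Python's sqlite3 module, where a SQL NULL is represented as None; the column names are assumptions.

```python
import sqlite3

# Hypothetical Students table; Suneo's course is NULL (None in Python).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (roll_no INTEGER, name TEXT, age INTEGER, course TEXT)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    [
        (101, "Nobita", 18, "Chemistry"),
        (102, "Suneo", 16, None),
        (103, "Shizuka", 18, "Chemistry"),
        (104, "Gian", 23, "Maths"),
        (105, "Kiteretsu", 22, "Biology"),
        (106, "Kenichi", 19, "English"),
        (107, "Mioko", 22, "Biology"),
    ],
)

# DISTINCT keeps NULL as one of the distinct values instead of skipping it.
courses = [row[0] for row in conn.execute("SELECT DISTINCT course FROM Students")]
print(courses)
conn.close()
```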

Output

course
Chemistry
NULL
Maths
Biology
English

In the query SELECT DISTINCT course FROM Students; the DISTINCT clause treats NULL as a value in the course column. If a column contains several NULLs, they are treated as duplicates of one another, so DISTINCT returns a single NULL however many times NULL appears.

Because DISTINCT does not ignore NULL values, the output contains every unique course in which students are enrolled, plus NULL, as given in the Students table.

Difference between DISTINCT and GROUP BY

The DISTINCT clause in SQL filters out duplicate records and returns only the unique ones. GROUP BY, in comparison, is mainly used for grouping rows so that aggregate functions can be applied to each group. GROUP BY can also return unique records, but it is slightly more verbose than DISTINCT for that purpose.

Consider the following Students table, in which some address values are repeated.

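| Roll No | Name | Age | Address | Course |
|---|---|---|---|---|
| 101 | Nobita | 18 | Japan | Physics |
| 102 | Suneo | 16 | America | Aerospace |
| 103 | Shizuka | 18 | Japan | Chemistry |
| 104 | Gian | 23 | Korea | Maths |
| 105 | Kiteretsu | 22 | London | Geology |
| 106 | Kenichi | 19 | America | English |
| 107 | Mioko | 22 | Australia | Biology |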

First, use DISTINCT to get the unique addresses from the table.
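A minimal sketch of the DISTINCT version, run with Python's sqlite3 module against a reduced copy of the table (name and address only, with assumed column names):

```python
import sqlite3

# Hypothetical Students table reduced to the columns this example needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Nobita", "Japan"), ("Suneo", "America"), ("Shizuka", "Japan"),
     ("Gian", "Korea"), ("Kiteretsu", "London"), ("Kenichi", "America"),
     ("Mioko", "Australia")],
)

# Japan and America each occur twice in the table but once in the result.
addresses = [row[0] for row in conn.execute("SELECT DISTINCT address FROM Students")]
print(addresses)
conn.close()
```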

Output

Address
Japan
America
Korea
London
Australia

The DISTINCT clause is applied to the address column (SELECT DISTINCT address FROM Students;), so each address in the Students table is returned only once.

Now use GROUP BY to produce the same output along with the count of each address in the table.
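The GROUP BY version might be written as in the sketch below. The alias address_count follows the output header; the other names are assumptions, and sqlite3 is used only to execute the statement.

```python
import sqlite3

# Hypothetical Students table reduced to the columns this example needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Nobita", "Japan"), ("Suneo", "America"), ("Shizuka", "Japan"),
     ("Gian", "Korea"), ("Kiteretsu", "London"), ("Kenichi", "America"),
     ("Mioko", "Australia")],
)

# GROUP BY forms one group per address; COUNT(*) reports the size of each group.
query = """
    SELECT address, COUNT(*) AS address_count
    FROM Students
    GROUP BY address
"""
address_counts = dict(conn.execute(query).fetchall())
print(address_counts)
conn.close()
```

The set of addresses is the same as with DISTINCT; the difference is that GROUP BY makes a per-address aggregate such as the count available in the same query.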

Output

| Address | address_count |
|---|---|
| Japan | 2 |
| America | 2 |
| Korea | 1 |
| London | 1 |
| Australia | 1 |

Here the output is grouped by the address column (GROUP BY address, with COUNT(*) aliased as address_count), so each unique address is returned once, together with the number of times it occurs in the Students table.

Conclusion

  • The DISTINCT keyword in SQL is used in conjunction with the SELECT statement. Only unique records are returned when DISTINCT is used to fetch records from a table containing duplicates.
  • DISTINCT can operate on a single column or on multiple columns. With multiple columns, a row is treated as a duplicate only when all the selected columns match.
  • GROUP BY is also used for fetching unique records, but the main difference between DISTINCT and GROUP BY is that the latter is used for aggregating and grouping rows that help summarize a particular column of a table.
  • For example, the DISTINCT clause in SQL is widely used in a School Management Database to find out the unique city names of students in the school.