SQL SELECT DISTINCT Statement

Overview

The DISTINCT keyword in SQL eliminates duplicate records from the result returned by a query. It is used in combination with the SELECT statement, so that when a table contains duplicate values, each unique record is returned only once.

Introduction to DISTINCT in SQL

Imagine working with tables that contain numerous duplicate entries when you need only the unique values for subsequent operations. In such a situation, the SQL DISTINCT keyword is helpful: it eliminates duplicate values from the query result and returns each unique value once. In its simplest form, DISTINCT is applied to a single column. It does not modify the table itself; duplicates are removed only from the output.

DISTINCT can also be used inside aggregate SQL functions such as COUNT, SUM, and AVG, so that each distinct value contributes only once (with MAX, DISTINCT is accepted but does not change the result). DISTINCT is not limited to a single column: when several columns are selected, it eliminates rows in which all the selected columns are identical to those of another row, keeping one copy of each combination.
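
As a minimal sketch of DISTINCT inside an aggregate, the following uses Python's built-in sqlite3 module with a small in-memory table. The table name students, its columns, and the four sample rows are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (roll_no INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(101, 18), (102, 16), (103, 18), (104, 23)],
)

# COUNT(age) counts every non-NULL age; COUNT(DISTINCT age) counts each age once.
total_ages, unique_ages = conn.execute(
    "SELECT COUNT(age), COUNT(DISTINCT age) FROM students"
).fetchone()
print(total_ages, unique_ages)  # prints: 4 3
conn.close()
```

The age 18 appears twice in the sample rows, so the plain count is 4 while the distinct count is 3.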

Note: If a column contains NULL values, DISTINCT also includes NULL as one distinct record in the result.

Syntax of DISTINCT in SQL

The general form is SELECT DISTINCT column_name FROM table_name WHERE condition. DISTINCT is written immediately after SELECT and is followed by the name of the column on which we want to apply it. The optional WHERE clause filters the rows before duplicates are removed from the result.

Parameters of DISTINCT in SQL

A DISTINCT clause in SQL can be applied to any valid SELECT query. It keeps one row for each unique combination of the selected columns and discards the remaining duplicates.

column_name: name of the column on which we want to apply the DISTINCT clause.

table_name: name of the table from which we want to retrieve the records.

WHERE condition: an optional clause that restricts the fetched records to those satisfying the given condition.
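
The three parameters can be seen together in a short sketch. It uses Python's built-in sqlite3 module; the students table, its columns, the sample rows, and the condition age >= 18 are all hypothetical choices for illustration.

```python
import sqlite3

# Hypothetical in-memory table used only to exercise the syntax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER, course TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("A", 18, "Physics"),
        ("B", 16, "Aerospace"),
        ("C", 19, "Physics"),
        ("D", 22, "Biology"),
    ],
)

# column_name = course, table_name = students, WHERE condition = age >= 18
query = "SELECT DISTINCT course FROM students WHERE age >= 18"
courses = [row[0] for row in conn.execute(query)]
print(courses)
conn.close()
```

The WHERE clause first drops the 16-year-old's row, and DISTINCT then collapses the two remaining Physics rows into one. Row order is not guaranteed unless an ORDER BY clause is added.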

How to Use DISTINCT in SQL?

Let us consider the following Students table, containing the roll no, name, age, address, and course name in which a student is enrolled.

Roll No	Name	Age	Address	Course
101	Nobita	18	Japan	Physics
102	Suneo	16	America	Aerospace
103	Shizuka	18	Japan	Chemistry
104	Gian	23	Korea	Maths
105	Kiteretsu	22	London	Geology
106	Kenichi	19	Singapore	English
107	Mioko	22	Australia	Biology

First, write a SQL query to return all students' ages, including duplicate values.
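
One way to run such a query is sketched below with Python's built-in sqlite3 module. The identifiers students, roll_no, name, age, address, and course are assumed names for the table and columns shown above.

```python
import sqlite3

# Assumed schema: table and column names are illustrative identifiers
# for the Students table shown above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "roll_no INTEGER, name TEXT, age INTEGER, address TEXT, course TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Nobita", 18, "Japan", "Physics"),
        (102, "Suneo", 16, "America", "Aerospace"),
        (103, "Shizuka", 18, "Japan", "Chemistry"),
        (104, "Gian", 23, "Korea", "Maths"),
        (105, "Kiteretsu", 22, "London", "Geology"),
        (106, "Kenichi", 19, "Singapore", "English"),
        (107, "Mioko", 22, "Australia", "Biology"),
    ],
)

# Plain SELECT: one output row per student, so repeated ages are kept.
ages = [row[0] for row in conn.execute("SELECT age FROM students")]
print(ages)
conn.close()
```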

Output

age
18
16
18
23
22
19
22

The query returns the age of every student, including the duplicate values 18 and 22, each of which occurs twice in the Age column of the Students table. To remove these duplicates, place DISTINCT immediately after SELECT, before the column name.
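
A sketch of the DISTINCT version follows, again using Python's built-in sqlite3 module. Only the roll number and age columns are loaded here, and the identifiers students, roll_no, and age are assumed names.

```python
import sqlite3

# Assumed schema: only the columns needed for this query are created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (roll_no INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(101, 18), (102, 16), (103, 18), (104, 23),
     (105, 22), (106, 19), (107, 22)],
)

# DISTINCT goes right after SELECT and collapses repeated ages into one row each.
unique_ages = [row[0] for row in conn.execute("SELECT DISTINCT age FROM students")]
print(unique_ages)
conn.close()
```

SQL does not guarantee the order of the returned rows unless ORDER BY is used, so the order may differ from the listing that follows.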

Output

age
18
16
23
22
19

With DISTINCT, the query returns only the unique ages from the Students table. The values 18 and 22 each appear once in the output, because the DISTINCT clause eliminates their duplicate occurrences.

Examples of DISTINCT in SQL

1. Example of Finding Unique Values in a Single Column

Consider the following Companies table, which contains the company name and location of their headquarters.

S No	Name	State	Country
1	Microsoft	Washington	USA
2	BMW	Munich	Germany
3	Walmart	Arkansas	USA
4	Vodafone	London	UK
5	Accenture	Dublin	Ireland
6	Nissan	Yokohama	Japan
7	Ericsson	Stockholm	Sweden
8	Godrej	Mumbai	India
9	Barclays	London	UK
10	Nikon	Tokyo	Japan

Write a SQL query to find the unique headquarter countries from the Companies table.
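
Such a query might look like the following sketch, which uses Python's built-in sqlite3 module. The identifiers companies, s_no, name, state, and country are assumed names for the table and columns shown above.

```python
import sqlite3

# Assumed schema for the Companies table shown above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies (s_no INTEGER, name TEXT, state TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?, ?)",
    [
        (1, "Microsoft", "Washington", "USA"),
        (2, "BMW", "Munich", "Germany"),
        (3, "Walmart", "Arkansas", "USA"),
        (4, "Vodafone", "London", "UK"),
        (5, "Accenture", "Dublin", "Ireland"),
        (6, "Nissan", "Yokohama", "Japan"),
        (7, "Ericsson", "Stockholm", "Sweden"),
        (8, "Godrej", "Mumbai", "India"),
        (9, "Barclays", "London", "UK"),
        (10, "Nikon", "Tokyo", "Japan"),
    ],
)

# Single-column DISTINCT: each country is returned once.
countries = [
    row[0] for row in conn.execute("SELECT DISTINCT country FROM companies")
]
print(countries)
conn.close()
```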

Output

country
USA
Germany
UK
Ireland
Japan
Sweden
India

The query uses a DISTINCT clause to filter out duplicate country names, so the output contains each country from the Companies table exactly once.

2. Example of Finding Unique Values in Multiple Columns

Consider the following version of the Companies table, which contains the company names and the location of their headquarters.

S No	Name	State	Country
1	Microsoft	Washington	USA
2	BMW	Munich	Germany
3	Walmart	Arkansas	USA
4	Vodafone	London	UK
5	Accenture	Dublin	Ireland
6	Hitachi	Tokyo	Japan
7	Ericsson	Stockholm	Sweden
8	Godrej	Mumbai	India
9	Barclays	London	UK
10	Nikon	Tokyo	Japan

Now, write a SQL query that uses DISTINCT to find the unique state and country combinations in the Companies table.
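
A possible form of this query is sketched below with Python's built-in sqlite3 module. The identifiers companies, s_no, name, state, and country are assumed names for the table and columns shown above.

```python
import sqlite3

# Assumed schema for the Companies table shown above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies (s_no INTEGER, name TEXT, state TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?, ?)",
    [
        (1, "Microsoft", "Washington", "USA"),
        (2, "BMW", "Munich", "Germany"),
        (3, "Walmart", "Arkansas", "USA"),
        (4, "Vodafone", "London", "UK"),
        (5, "Accenture", "Dublin", "Ireland"),
        (6, "Hitachi", "Tokyo", "Japan"),
        (7, "Ericsson", "Stockholm", "Sweden"),
        (8, "Godrej", "Mumbai", "India"),
        (9, "Barclays", "London", "UK"),
        (10, "Nikon", "Tokyo", "Japan"),
    ],
)

# Multi-column DISTINCT: a row is a duplicate only if state AND country both match.
pairs = conn.execute("SELECT DISTINCT state, country FROM companies").fetchall()
for state, country in pairs:
    print(state, country)
conn.close()
```

Note that DISTINCT applies to the whole select list, not just the first column: USA still appears twice in the result because Washington and Arkansas form different combinations.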

Output

state	country
Washington	USA
Munich	Germany
Arkansas	USA
London	UK
Dublin	Ireland
Tokyo	Japan
Stockholm	Sweden
Mumbai	India

The above output contains all unique state and country combinations from the Companies table. The ten rows yield only eight records because (London, UK) and (Tokyo, Japan) each occur twice, and each duplicated combination is returned only once.

3. Example of Handling NULL Using DISTINCT Clause

Consider the following Students table, which contains the roll no, name, age, and course in which each student is enrolled.

Roll No	Name	Age	Course
101	Nobita	18	Chemistry
102	Suneo	16	NULL
103	Shizuka	18	Chemistry
104	Gian	23	Maths
105	Kiteretsu	22	Biology
106	Kenichi	19	English
107	Mioko	22	Biology

Now, write a SQL query that uses DISTINCT to find the unique courses.
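
The following sketch runs such a query with Python's built-in sqlite3 module, where Python's None is stored as SQL NULL. The identifiers students, roll_no, name, age, and course are assumed names for the table and columns shown above.

```python
import sqlite3

# Assumed schema for the Students table shown above; None becomes SQL NULL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (roll_no INTEGER, name TEXT, age INTEGER, course TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [
        (101, "Nobita", 18, "Chemistry"),
        (102, "Suneo", 16, None),
        (103, "Shizuka", 18, "Chemistry"),
        (104, "Gian", 23, "Maths"),
        (105, "Kiteretsu", 22, "Biology"),
        (106, "Kenichi", 19, "English"),
        (107, "Mioko", 22, "Biology"),
    ],
)

# DISTINCT keeps NULL as one of the distinct values instead of dropping it.
courses = [row[0] for row in conn.execute("SELECT DISTINCT course FROM students")]
print(courses)  # the NULL course is reported by Python as None
conn.close()
```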

Output

course
Chemistry
NULL
Maths
Biology
English

In this query, the DISTINCT clause treats NULL as a value in the Course column, which means that two NULLs in the same column are considered duplicates of each other. Therefore, if the SELECT statement would return NULL multiple times, DISTINCT returns only one NULL.

Because DISTINCT does not ignore NULL values, the output contains all unique course names in which students are enrolled, including NULL, as given in the Students table.

Difference between DISTINCT and GROUP BY

The DISTINCT clause in SQL filters out duplicate records and returns the unique ones. In comparison, GROUP BY is mainly used for grouping rows so that aggregate values can be computed per group. GROUP BY can also be used to obtain unique records, but in a slightly more involved manner than DISTINCT.

Consider the following Students table, in which some addresses occur more than once.

Roll No	Name	Age	Address	Course
101	Nobita	18	Japan	Physics
102	Suneo	16	America	Aerospace
103	Shizuka	18	Japan	Chemistry
104	Gian	23	Korea	Maths
105	Kiteretsu	22	London	Geology
106	Kenichi	19	America	English
107	Mioko	22	Australia	Biology

First, use DISTINCT to get the unique addresses from the table.
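
A sketch of this query, using Python's built-in sqlite3 module, is given below. Only the roll number and address columns are loaded, and the identifiers students, roll_no, and address are assumed names.

```python
import sqlite3

# Assumed schema: only the columns needed for this query are created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (roll_no INTEGER, address TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [
        (101, "Japan"),
        (102, "America"),
        (103, "Japan"),
        (104, "Korea"),
        (105, "London"),
        (106, "America"),
        (107, "Australia"),
    ],
)

# DISTINCT returns each address once, with no per-address information attached.
addresses = [
    row[0] for row in conn.execute("SELECT DISTINCT address FROM students")
]
print(addresses)
conn.close()
```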

Output

Address
Japan
America
Korea
London
Australia

The query applies the DISTINCT clause to the Address column and returns only the unique Address values, eliminating the duplicate ones from the result.

Now use GROUP BY to produce the same output along with the count of each address in the table.
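
The GROUP BY variant could be written as in the sketch below, which uses Python's built-in sqlite3 module. The identifiers students, roll_no, and address are assumed names, and address_count is the alias taken from the output listing.

```python
import sqlite3

# Assumed schema: only the columns needed for this query are created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (roll_no INTEGER, address TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [
        (101, "Japan"),
        (102, "America"),
        (103, "Japan"),
        (104, "Korea"),
        (105, "London"),
        (106, "America"),
        (107, "Australia"),
    ],
)

# GROUP BY forms one group per address; COUNT(*) reports the size of each group.
query = (
    "SELECT address, COUNT(*) AS address_count "
    "FROM students GROUP BY address"
)
counts = dict(conn.execute(query))  # maps each address to its row count
print(counts)
conn.close()
```

Unlike DISTINCT, the grouped query can attach an aggregate such as COUNT(*) to each unique value, which is the practical reason to choose GROUP BY here.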

Output

Address	address_count
Japan	2
America	2
Korea	1
London	1
Australia	1

Here the output is grouped by the Address column, so each unique Address value is returned once, together with the number of times it occurs in the Students table.

Conclusion

The DISTINCT keyword in SQL is used in conjunction with the SELECT statement. It returns each unique record once when fetching records from a table that contains duplicates.
DISTINCT can operate on a single column or on multiple columns. With multiple columns, a row is treated as a duplicate only when all the selected columns match.
GROUP BY can also be used for fetching unique records, but the main difference between DISTINCT and GROUP BY is that the latter groups rows so that aggregates can summarize a particular column of a table.
For example, the DISTINCT clause can be used in a school management database to find the unique city names of the students in the school.