Subqueries in SQL

quiz
Challenge Inside! : Find out where you stand! Try quiz, solve problems & win rewards!

Overview

A subquery is a query which is used with the another SQL query embedded in the WHERE clause. Subqueries are majorly used in SELECT, INSERT, UPDATE, DELETE statements, FROM and WHERE clauses. The subqueries are used wherever an expression is allowed. The data returned by the subquery is used by the outer statement in the same way a literal value would be used. Subqueries in SQL provide an easy and efficient way to handle the queries that depend on the results from the another query. There are some rules which are to be followed while applying subqueries in SQL which are explained in this article.

Scope of the Article

  • This article covers introduction to subqueries in SQL, rules to be followed while using a subquery.
  • How to use subqueries with the SELECT and INSERT statement is explained with the examples.
  • The usage of subqueries with the UPDATE and DELETE statement is also covered with the examples.
  • You'll learn subqueries with the clauses in SQL like FROM clause and WHERE clause is covered with the examples.
  • How to use subqueries with the IN, NOT IN, EXISTS, NOT EXIST is covered with the examples of each.
  • There is a brief introduction to nested and correlated subqueries.

Introduction to Subqueries in SQL

Subqueries also called a nested query or inner query is a query within another SQL query. In SQL, it is possible to place a SQL query inside another query. For example,

SELECT * 
FROM Students
WHERE marks = (
    SELECT max(marks)
    FROM Students
);

Let's take a look at subquery and outer query of the above given example.

Example of subquer

In a subquery, the outer query's result is dependent on the result-set of the inner query. This is the reason why subqueries are also called nested queries. Let's take an example to understand the working of the subqueries.

Consider a table of Students on which the below given query is applied.

Students:

student_idnamemarkscountry
41Richa Shah26India
42Malina Joseph25England
43Kavish Shah23Canada
44Heta Ramya26UK
45Kevin Peterson23USA
SELECT * 
FROM Students
WHERE marks = (
    SELECT MIN(marks)
    FROM Students
);

The above query is applied to the given student's table. Here, the subquery is executed first which selects minimum marks from the student's table. Then this result is passed on to the WHERE clause of the outer query. Then the outer query is executed which selects rows where marks is equal to the result of the subquery and returns the final result.

The output of the above SQL query is given below:

| 43 | Kavish Shah     | 23 | Canada |
| 45 | Kevin Peterson | 23 | USA    |

Before getting into the examples of using subqueries with different clauses and keywords, let's first understand why to use subqueries?

Subqueries are used to execute a query dependent on the outcome of another query. Subqueries allow you to accomplish the same task without writing two distinct queries. Subqueries are majorly used with INSERT, DELETE, SELECT, UPDATE statements along with the comparison operators like <, >, >=, <=, =, BETWEEN, IN, etc.

Subquery Rules

There are various rules that a subquery must follow, like:

  • Subqueries must be enclosed within parenthesis.

    For example:

    SELECT * 
    FROM Students
    WHERE marks = (
        SELECT max(marks)
        FROM Students
    );
    

    In the above example, the query which is enclosed in the parenthesis is called a subquery.

  • Subqueries always runs first followed by the main query.

  • The subqueries in MySQL cannot use the ORDER BY keyword whereas the main query can use the ORDER BY keywordwhenever required.

  • You cannot use BETWEEN operator within a SQL subquery. However, it can be used with the main query.

  • An ORDER BY command cannot be used in the subquery although the main query can use the ORDER BY command. You can use the GROUP BY command to perform the same function as the ORDER BY in the subquery.

  • It is not possible for subqueries to be enclosed in a set of functions.

  • Subqueries that are returning multiple rows can only be used together with the numerous value operators. Such operators include the IN operator.

  • The SELECT list must not include any references evaluating to a CLOB, NCLOB, BLOB, or ARRAY.

  • If there is a subquery that returns multiple rows together then it must be used with the multiple value operators only, like an IN operator.

To understand the subqueries with the different keywords in SQL, the following table is used as a reference throughout the article.

Employee:

emp_idemp_nameemp_ageemp_deptemp_salaryemp_country
121Lisa Carol27Finance30,000USA
122Farooq Shaikh29HR20,000USA
123Nastya Henry28Technology45,000UK
124Christine Maybach30Research50,000USA
125Ryan Renolds27Research50,000Canada

Subqueries with SELECT Statement

In most cases, subqueries are used with the SELECT statements.

Syntax:

SELECT column_name
FROM table_name
WHERE column_name expression operator (
    SELECT column_name
    FROM table_name
    WHERE ...
);

To understand more clearly let's take the above mentioned Employee table and run the below mentioned query and observe the output.

Example:

SELECT emp_name from Employee
WHERE emp_salary = (
    SELECT max(emp_salary)
    FROM Employee
);

Do not worry about the working of the above query, you'll get to know in the next section but let's first see what's the result when the query is run on the Employee table.

The final result of the above query is:

Christine Maybach
Ryan Renolds

Let's break the query statements to understand it better. In this example, the employee name having the maximum salary is filtered out and is displayed as the result. To achieve this result a subquery is created which finds the maximum salary using the max() function as shown below:

SELECT max(emp_salary)
FROM Employee

The above query acts as a subquery in our example and returns the value 50,000. Now, to further fetch the employee name having the salary as 50,000 the below query is applied.

SELECT emp_name from Employee
WHERE emp_salary = (
    SELECT max(emp_salary)
    FROM Employee
);

Here, the query to find out the maximum salary is the subquery. The outer query takes the result of the inner query and executes the remaining SQL commands based on that result. So, the subquery returns the maximum salary, i.e. 50,000. And this result is passed on to the outer query which fetches the name of the employee having the salary of 50,000.

Subqueries with INSERT statement

After the SELECT statement, the second-best option to use subqueries is with INSERT statements. The selected data in the subquery can be further modified through any characters, number functions, etc.

The INSERT statement uses data returned from the subquery to enter into another table.

Syntax:

INSERT INTO table_name
SELECT *
FROM table_name
WHERE VALUE OPERATOR

To understand more clearly let's take the above mentioned Employee table and also consider that a new employee table i.e. EmployeeNew is available in the database and run the below mentioned query and observe the output.

Example:

INSERT INTO EmployeeNew
SELECT * FROM Employee
WHERE emp_id IN (
    SELECT emp_id
    FROM Employee
);

The final result of the above query i.e. the EmployeeNew table would look like below:

EmployeeNew:

121 | Lisa Carol | 27 | Finance |30000 | USA |
122 | Farooq Shaikh | 29 | HR| 20000 | USA |
123 | Nastya Henry | 28 | Technology | 45000 | UK |
124 | Christine Maybach | 30 | Research | 50000 | USA|
125 | Ryan Renolds | 27 | Research | 50000 | Canada |

Here, the EmployeeNew table is empty and the above query copies all the columns from the Employee table to the EmployeeNew table. Let's break the query statements to understand clearly.

The INSERT statement specifies that the new data is added to this table i.e. EmployeeNew table. As we are copying all the data from the Employee table to the EmployeeNew table, there is no need to specify the column name in the INSERT statement otherwise you would need to mention the column name in which you want to add the data.

Consider the EmployeeNew table which is created above and this is also used in the below examples.

EmployeeNew:

emp_idemp_nameemp_ageemp_deptemp_salaryemp_country
121Lisa Carol27Finance30000USA
122Farooq Shaikh29HR20000USA
123Nastya Henry28Technology45000UK
124Christine Maybach30Research50000USA
125Ryan Renolds27Research50000Canada

Example:

Let's take another example in which only the employees whose age is less than 29 are inserted into a new table i.e. EmployeeNew table using the subqueries in the INSERT statement.

INSERT INTO EmployeeNew
SELECT * FROM Employeee
WHERE emp_age IN (
    SELECT emp_age
    FROM Employee
    WHERE emp_age < 29
);

The final result of the above query would look like the below:

EmployeeNew:

| 121 | Lisa Carol | 27 | Finance | 30000 | USA |
| 123 | Nastya Henry | 28 | Technology | 45000 | UK |
| 125 | Ryan Renolds | 27 | Research | 50000 | Canada |

In this example, the subquery filters out the result and the employees whose age is less than 29 are included in the result set of the inner query or subquery. Then, this result is passed on to the outer query, and all the employees' data from the Employee table whose age is less than 29 are added to the EmployeeNew table as shown in the final result above.

Subqueries with UPDATE Statement

The UPDATE statement is used to modify an existing record within the table. To benefit from the usage of subqueries the WHERE clause should be used with the UPDATE statement otherwise the overall functioning of the table may get disturbed. In the UPDATE statement, the two clauses in which subqueries are most commonly used are SET and WHERE.

Syntax:

UPDATE Employee
SET column_name = new_value
WHERE Operator (
    SELECT COLUMN_NAME
    FROM TABLE_NAME
    WHERE ...
);

To understand more clearly let's take the above mentioned Employee table and also consider that a new employee table i.e. EmployeeNew is available in the database and run the below mentioned query and observe the output.

Example:

UPDATE Employee
SET emp_salary = emp_salary * 2
WHERE emp_age IN (
    SELECT emp_age
    FROM EmployeeNew
    WHERE emp_age > 27
);

The final result of the above query would look like the below:

Employee:

| 121 | Lisa Carol | 27 | Finance | 30000 | USA |
| 122 | Farooq Shaikh | 29 | HR | 40000 | USA |
| 123 | Nastya Henry | 28 | Technology | 90000 | UK |
| 124 | Christine Maybach | 30 | Research | 100000|USA |
| 125 | Ryan Renolds | 27 | Research | 50000 | Canada |**

In this example, the Employee table is updated using UPDATE and SET. Here, the SET clause defines the new value for the emp_salary column which is being modified by the UPDATE statement. As you can see in the subquery, the employees having age greater than 27 are filtered out and this result set is passed on to the outer query. The outer query updates the emp_salary column while considering the result set of the subquery i.e. salary of all the Employees having an age greater than 27 will increase by two times as mentioned in the outer query.

Subqueries with DELETE Statement

The subqueries can also be used with the DELETE statements to delete one or multiple records from the table.

Syntax:

DELETE FROM TableName
WHERE Operator (
    SELECT COLUMN_NAME
    FROM TABLE_NAME
    WHERE condition
);

To understand more clearly let's take the above mentioned Employee table and also consider that a new employee table i.e. EmployeeNew is available in the database that is mentioned above in the article and run the below mentioned query and observe the output.

DELETE FROM Employee
WHERE emp_dept NOT IN (
    SELECT emp_dept
    FROM EmployeeNew
    WHERE emp_dept = 'Research'
);

The final result of the above query i.e. the Employee table would look like the below:

| 121 | Lisa Carol | 27 | Finance | 30000 | USA |
| 122 | Farooq Shaikh | 29 | HR | 20000 | USA |
| 123 | Nastya Henry | 28 | Technology | 45000 | UK |

In this example, the DELETE statement is used to delete the rows from the Employee table with the help of EmployeeNew table. Firstly, the subquery is executed and the employees from the EmployeeNew table whose department is "Research" are fetched out and this result set is passed on to the outer query.

The outer query then deletes the records from the Employee table who is in the Research department. Note that not all the records that are present in the EmployeeNew and Employee table are deleted but only those records or employees who are in the Research department are deleted.

Subqueries in FROM clause

FROM is used to specify the source from which the data is to be fetched. Here, subqueries in FROM clause create an intermediate table that can be used directly to retrieve the results for the main SELECT query or joined with the other tables and then used subsequently.

To understand more clearly let's create a new Employee table and the Department table and run the below mentioned query and observe the output.

Department:

dept_iddept_namedept_headhead
0330FinanceMr Patel121
0335HRMrs Shah123
0340TechnologyMr Ramchandani125
0345ResearchMr Bajaj122

Employee:

emp_idemp_namedept_idemp_deptemp_salaryemp_country
121Lisa Carol0330Finance30,000USA
122Farooq Shaikh0335HR20,000USA
123Nastya Henry0340Technology45,000UK
124Christine Maybach0345Research50,000USA
125Ryan Renolds0345Research50,000Canada
SELECT dept_id, emp_count
FROM (
    SELECT count(emp_id) AS "emp_count", dept_id
    FROM Employee
    GROUP BY dept_id
) AS employee_details
ORDER BY emp_count

The final result of the above query would look like below:

| 330 | 1
| 335 | 1
| 340 | 1
| 345 | 2

Subqueries in WHERE Clause

To filter the rows from the result set, subqueries in the WHERE clause can be used by comparing a column in the main table with the results of the subquery.

Syntax:

SELECT column_name
FROM table_name
WHERE column_name expression (
    SELECT column_name
    FROM TABLE_NAME
);

To understand more clearly let's take the above mentioned Employee table and Department table is available in the database and run the below mentioned query and observe the output.

Example:

SELECT dept_name
FROM department
WHERE head IN (
    SELECT emp_id
    FROM Employee
    WHERE emp_country = 'USA' 
);

The final result of the above query is:

Finance
Research

In this example, the subquery extracts out the rows in which the employee's country is the USA and emp_id corresponding to that employee is noted. The result set of this subquery is passed on to the outer query and the outer query looks for the values of head 121, 122, 124 because these are the head values corresponding to the employees residing in the USA. But only 121 and 122 head values are present in the Department table and thus dept_name associated with these head values are printed in the result set of the above query.

Subquery with IN and NOT IN, EXISTS, Not EXIST

  • Subquery with IN and NOT IN

    Subqueries can be used with the IN and NOT IN keywords and the result of after executing these subqueries and result into zero or more values. The result set of this subquery is used by the outer query to further filter out the results.

    Let's take an example to understand how the subqueries work with the IN and NOT IN keywords.

Example:

SELECT * FROM Employee
WHERE emp_dept IN (
   SELECT dept_name
   FROM department
   WHERE dept_name = 'Finance'
);

The final result of the above query would look like below:

| 121 | Lisa Carol | 330 | Finance | 30000 | USA |

In this example, the subquery is used with the IN keyword. Here, the subquery extracts out the record from teh department table whose department is Finance. Then the result set of this query is used by the outer query to further execute the remaining query. The outer query then prints the result from the Employee table where the Employee department belongs to Finance and thus only one record i.e. on employee is available in the Employee table who is in the Finance department.

Example:

SELECT * FROM Employee
WHERE emp_dept NOT IN (
   SELECT dept_name
   FROM department
   WHERE dept_name = 'Finance'
);

The final result of the above query would look like below:

| 122 | Farooq Shaikh | 335 | HR | 20000 | USA |
| 123 | Nastya Henry | 340 | Technology | 45000 | UK |
| 124 | Christine Maybach | 345 | Research|50000 | USA |
| 125 | Ryan Renolds | 345 | Research | 50000 | Canada |

In the above example, the subquery filters out the rows from the department table whose department is Finance. The result set of this subquery is used by the outer query and all the records from the Employee table are available in the result set except those employees who are in Finance department. Therefore, the final result set of the above query contains all the employees except the one in Finance department.

  • Subquery with EXISTS and NOT EXISTS

    Subqueries with the EXISTS keyword can be used to know whether any row exists on the table or not. Let's take the examples below to understand more about how EXISTS and NOT EXISTS work with the subqueries.

    Subqueries that uses EXISTS keyword will return TRUE if the subquery returns any rows. If the subquery using EXISTS keyword does not return any rows then it will return FALSE.

Example:

SELECT * FROM Employee
WHERE EXISTS (
  SELECT null
)

The final result of the above query would look like below:

| 121 | Lisa Carol | 330 | Finance | 30000 | USA |
| 122 | Farooq Shaikh | 335 | HR | 20000 | USA |
| 123 | Nastya Henry|340 | Technology | 45000 | UK |
| 124 | Christine Maybach | 345 | Research | 50000 | USA |
| 125 | Ryan Renolds | 345 | Research | 50000 | Canada |

In the above example, the result set returned by the subquery contains NULL which causes the EXISTS operator to return to TRUE. Therefore, the above given query returns all the rows from the Employee table.

Example:

SELECT * FROM Employee
WHERE EXISTS (
    SELECT * FROM department
    WHERE Employee.emp_id = department.head
);

The final result of the above query would look like below:

| 121 | Lisa Carol | 330 | Finance | 30000 | USA |
| 122 | Farooq Shaikh | 335 | HR | 20000 | USA |
| 123| Nastya Henry | 340 | Technology | 45000 | UK |
| 125 | Ryan Renolds | 345 | Research | 50000 | Canada |

In the above example, the EXISTS keyword is used with the subquery. In the subquery all the records from the Employee table which has same emp_id as that in head of the department table is printed out in the result set. Therefore, all the records from the Employee table where ther is atleast one record from the department table with the same emp_id is available in the result set.

Example:

SELECT * FROM Employee
WHERE NOT EXISTS (
    SELECT * FROM department
    WHERE Employee.emp_id = department.head
);

The final result of the above query would look like the below:

| 124 | Christine Maybach | 345 | Research | 50000 | USA |

As shown in the above example, the subquery checks if there is any record available which has same emp_id from the Employee table and head from the department table. The result set of this query is then used by the outer query. The outer query consists of NOT EXISTS condition so all the records that are not present in the result set of the subquery is printed out in the final result of the query i.e. only one record which has emp_id = 345 is available in the result set of the above given example.

Subquery in HAVING Clause

Subqueries can also be used with the HAVING clause which filters the groups for the result set, by comparing a column in the main table with the results of the subquery.

To understand more clearly let's take the above mentioned Employee table and Department table available in the database and run the below mentioned query and observe the output.

SELECT dept_name, count(emp_id)
FROM department 

INNER JOIN Employee
ON department.dept_id = employee.emp_id

GROUP BY dept_name
HAVING count(emp_id) > (
    SELECT count(emp_id)
    FROM Employee
    WHERE emp_country = 'USA' 
);

The final result of the above query is:

| Research | 2 |

Nested and Correlated Subqueries

Nested queries mean when the subquery is executed first and its results are inserted into the WHERE clause of the outer query. In the case of the nested query. the subquery is independent of the outer query and both can run independently. The subqueries in the nested query are executed only once and the result set after executing the subquery is used by the outer query.

In a correlated subquery, the main query is analysed first and based on its results the subquery is initiated and executed. It is called correlated because the inner query references the column of the outer query. The correlated subquery is slow as the subquery is executed for every row returned by the outer query.

Check out this article to learn more about Types of Subqueries.

Conclusion

  • You learned about SQL subqueries and how these can be super useful and flexible to use to filter out data from the tables in SQL.
  • SQL subqueries are nested queries embedded inside the other query. Subqueries are good alternative to SQL joins as they increase efficiency.
  • In a subquery, the outer query's result is dependent on the result-set of the inner subquery.
  • The main query can use the ORDER BY command but an ORDER BY command cannot be used in the subquery. You can use the GROUP BY command to perform the same function as the ORDER BY in the subquery.
  • The result returned by a single row subquery can be set to the new column value using an UPDATE statement.
  • Subqueries are also used with the HAVING, WHERE, clauses.
  • With the DELETE statement subqueries can be used and it can be useful to delete the records from various tables.
Challenge Time!
quiz
quiz
Time to test your skills and win rewards! Note: Rewards will be credited after the next product update.
Free Courses by top Scaler instructors
rcbGet a Free personalized Career Roadmap from