Aggregate Functions in SQL

Learn via video course
FREE
View all courses
DBMS Course - Master the Fundamentals and Advanced Concepts
DBMS Course - Master the Fundamentals and Advanced Concepts
by Srikanth Varma
1000
5
Start Learning
DBMS Course - Master the Fundamentals and Advanced Concepts
DBMS Course - Master the Fundamentals and Advanced Concepts
by Srikanth Varma
1000
5
Start Learning
Topics Covered

Aggregate functions in SQL perform calculation on multiple values and return a single value. SQL provides many aggregate functions that include avg, count, sum, min, max, etc. An aggregate function ignores NULL values when it performs the calculation, except for the count function.

What are Aggregate Functions in SQL?

Aggregate functions in SQL are used to perform operations on multiple rows of a table and return a single value. A single function can be used to compute multiple values and return the required result.

There are 5 types of SQL aggregate functions:

  • Count()
  • Sum()
  • Avg()
  • Min()
  • Max()
  • Count():
    • Count(*): Provides the total number of records in the dataset, which is 8.
    • Count(Score): Presents the count of non-null values in the Score column, indicating there are 7 such values.
    • Count(Distinct Score): Indicates the count of distinct non-null values in the Score column, totaling 5 unique scores.
  • Sum():
    • sum(Score): Aggregates the sum of all non-null values in the Score column, resulting in a sum of 490.
    • sum(Distinct Score): Computes the sum of distinct non-null values in the Score column, amounting to 423 (85+78+90+92+65).
  • Avg():
    • Avg(Score): Computes the average score by dividing the sum of all scores by the count of non-null scores, resulting in 490 / 7 .
    • Avg(Distinct Score): Calculates the average of distinct scores by dividing the sum of distinct scores by the count of distinct scores, yielding approximately 423 / 5.
  • Min():
    • Min(Score): Identifies the minimum score in the Score column, excluding null values, which is 65.
  • Max():
    • Max(Score): Identifies the maximum score in the Score column, which is 92.

SQL Aggregate Function

Example

Before we dig deep into aggregate functions in SQL, let us take a look at the sample data we’ll be using. The given table named EMP_DATA consists of data concerning 10 employees working in the same organization in different departments.

The table stores a unique EMPLOYEE_ID, Name of the employee, department in which the employee is working, and the salary.

Employee_IDNameDepartmentSalary
1RamMarketing80000
2HenryProduction76000
3DishaR&D76000
4HelenR&D84000
5MeeraMarketing80000
6AshishProduction64000
7BobProduction60000
8HariR&D60000
9PreetiMarketingNULL
10MarkProduction66000

Let us dig into each aggregate functions in SQL with the help of the above table EMP_DATA. Now coming back to the aggregate functions in SQL, they can be of the following types as shown in the figure. We will be understanding the working of these functions one by one.

1. COUNT() Function

The COUNT() aggregate function returns the total number of rows from a database table that matches the defined criteria in the SQL query.

Syntax:

COUNT(*) returns the total number of rows in a given table. COUNT(COULUMN_NAME) returns the total number of non-null values present in the column which is passed as an argument in the function.

Let’s take a look at a few examples to understand the function better.

Example:

Suppose you want to know the total number of employees working in the organization. You can do so by the below-given query.

As COUNT(*) returns the total number of rows and the table named EMP_DATA provided above consists of 10 rows, so the COUNT(*) function returns 10. The output is printed as shown below.

Output:

Note: Except for COUNT(*), all other SQL aggregate functions ignore NULL values.

Suppose you need to count the number of people who are getting a salary. The query given below can help you achieve this.

Here, the Salary column is passed as a parameter to the COUNT() function, and hence, this query returns the number of non NULL values from the column Salary, i.e. 9.

Output:

Suppose you need to count the number of distinct departments present in the organization. The following query can help you achieve this.

The above query returns the total number of distinct non NULL values over the column Department i.e. 3 (Marketing, Production, R&D). The DISTINCT keyword makes sure that only non-repetitive values are counted.

Output:

What if you want to calculate the number of people whose salaries are more than a given amount(say 70,000)? Check out the example below.

The query returns the number of rows where the salary of the employee is greater than or equal to 70,000 i.e 5.

Output:

2. SUM() Function

The SUM() function takes the name of the column as an argument and returns the sum of all the non NULL values in that column. It works only on numeric fields(i.e the columns contain only numeric values). When applied to columns containing both non-numeric(ex - strings) and numeric values, only numeric values are considered. If no numeric values are present, the function returns 0.

Syntax:

The function name is SUM() and the name of the column to be considered is passed as an argument to the function.

Let’s look into some examples to understand the usage better.

Example:

Suppose you need to build a budget for the organization and you need to know the total amount needed to provide salary to all the employees. To calculate the sum of all the values present in column Salary. You can refer to the below-given example.

The above mentioned query returns the sum of all non NULL values over the column Salary i.e 80000 + 76000 + 76000 + 84000 + 80000 + 64000 + 60000 + 60000 + 66000 = 646000

Output:

What if you need to consider only distinct salaries? The following query will help you achieve that.

The DISTINCT keyword makes sure that only non-repetitive values are considered. The query returns the sum of all distinct non NULL values over the column Salary i.e. 80000 + 76000 + 84000 + 64000 + 60000 + 66000 = 430000

Output:

Suppose you need to know the collective salaries for each department(say Marketing). The query given below can help you achieve this.

The query returns the sum of salaries of employees who are working in the Marketing Department i.e 80000 + 80000 = 160000

Output:

Note: There are 3 rows consisting of Marketing as Department value but the third value is a NULL value. Thus, the sum is returned considering only the first two entries having Marketing as Department.

3. AVG() Function

The AVG() aggregate function uses the name of the column as an argument and returns the average of all the non NULL values in that column. It works only on numeric fields(i.e the columns contain only numeric values).
Note: When applied to columns containing both non-numeric(ex - strings) and numeric values, only numeric values are considered. If no numeric values are present, the function returns 0.

Syntax:

The function name is AVG() and the name of the column to be considered is passed as an argument to the function.

Let's take a look at some examples to understand the function better.

Example:

To obtain the average salary of an employee of an organization, the following query can be used.

Here, the column name Salary is passed as an argument and thus the values present in column Salary are considered. The above query returns the average of all non NULL values present in the Salary column of the table.

Average = (80000 + 76000 + 76000 + 84000 + 80000 + 64000 + 60000 + 60000 + 66000 ) / 9 = 646000 / 9 = 71777.77777

Output:

If you need to consider only distinct salaries, the following query will help you out.

The query returns the average of all non NULL distinct values present in the Salary column of the table.

Average = (80000 + 76000 + 84000 + 64000 + 60000 + 66000) / 6 = 430000/ 6 = 71666.66666

Output:

4. MIN() Function

The MIN() function takes the name of the column as an argument and returns the minimum value present in the column. MIN() returns NULL when no row is selected.

Syntax:

The function name is MIN() and the name of the column to be considered is passed as an argument to the function.

To understand this better, let’s take a look at some examples.

Example:

Suppose you want to find out what is the minimum salary that is provided by the organization. The MIN() function can be used here with the column name as an argument.

The query returns the minimum value of all the values present in the mentioned column i.e 60000.

Output:

Suppose you need to know the minimum salary of an employee belonging to the Production department. The following query will help you achieve that.

The query returns the minimum value of all the values present in the mentioned column and has Production as Department value i.e 60000.

Output:

5. MAX() Function in SQL

The MAX() function takes the name of the column as an argument and returns the maximum value present in the column. MAX() returns NULL when no row is selected.

Syntax:

The function name is MAX() and the name of the column to be considered is passed as an argument to the function.

To get a better idea how the function works, let’s look at some examples.

Example:

Suppose you want to find out what is the maximum salary that is provided by the organization. The MAX() function can be used here with the column name as an argument.

The query returns the maximum value of all the values present in the mentioned column i.e 84000.

Output:

Suppose you need to know the maximum salary of an employee belonging to the R&D department. The following query will help you achieve that.

The query returns the maximum value of all the values present in the mentioned column and has R&D as Department value i.e 84000.

Why Use Aggregate Functions?

  • Aggregate functions in SQL play a pivotal role in database management systems, enabling swift and efficient calculations on extensive datasets.
  • They facilitate tasks such as generating statistical reports, conducting financial analysis, and managing inventory levels.
  • Additionally, aggregate functions in SQL aid in comprehending data by simplifying tasks such as calculating average prices or total sales for specific periods, tasks that would otherwise be laborious and error-prone without them.
  • In essence, aggregate functions in SQL are indispensable tools for individuals handling substantial data volumes, allowing them to extract valuable insights effectively.

Conclusion

  • Aggregate functions in SQL are an exceptionally useful feature in data retrieval from the database. The fact that you can query data and apply functions in the same query is remarkable
  • A single line query can be used to perform operations on multiple records. Using an SQL aggregate function saves time as well as avoids any error.