Window Functions in Mysql

Learn via video courses
Topics Covered

Overview

Window functions in MySQL are used to perform calculations like aggregation, analytics, and ranking over a set of rows. It works in the same way as the aggregate functions in terms of calculations but the aggregate function returns a single row after computing a whole column whereas a Windows function is applied OVER a set of rows which is known as a Window and it returns the value associated with every row individually. There will be a unique identity maintained for each row.

Introduction

The Window functions in MySQL are special kinds of pre-defined methods that are used to perform computations on a set of rows rather than a particular column. The set of rows on the methods that are applied is referred to as a Window. The row on which the window function is applied is known as the current row. The current row is related to the set of rows on which the computations are applied. As discussed earlier, the window functions in MySQL work in a similar way as the aggregate functions do. But the catch is, the aggregate function is applied on a particular column or a whole table producing output in a single row whereas window functions in MySQL operate on every row from a selected set of rows (Window) and produce an output corresponding to each row.

It is mandatory to use an OVER clause with window functions in MySQL as it is used to define the set of rows (Window) on which the functions should be applied. Apart from that, the OVER clause is used to order the selected rows in groups.

Earlier, window functions were not supported in MySQL. It was introduced in version 8.0.

Syntax:

In the above syntax, the OVER clause is used after the window function to define the partitions, order, and range of the rows.

Here, the PARTITION BY clause is used to make divisions or partitions of rows so we can specify the rows to be used in the window function computation. The ORDER BY clause is used to sort the order of rows that exists in a partition. The ROW or RANGE clause is used to form a subset of rows in the current partition

Both, ORDER BY and ROW or RANGEclauses are optional for the window function.

Now, Let's discuss a bit the types of Window functions in MySQL.

The Window functions are divided into three categories:-

  • Aggregate Window Functions
  • Analytical Window Functions
  • Ranking Window Functions

Aggregate Window Function

Aggregate window functions are the kind of functions that are used to perform mathematical calculations on the data stored in MySQL. They are used to aggregate the data. Some of the most commonly used aggregate functions are as follows:

  • SUM()
  • AVG()
  • MIN()
  • MAX()
  • COUNT()

Analytical Window Function

Analytical window functions are the kind of functions that are used to carry out analysis of the data stored in a database. Some of the most commonly used analytical functions are as follows:

  • NTILE()
  • LAG()
  • LAST_VALUE()
  • FIRST_VALUE()
  • LEAD()

Ranking Window Function

Ranking window functions are the kind of functions that are used to give rankings to each row of a partition. Some of the most commonly used ranking functions are as follows:

  • RANK()
  • DENSE_RANK()
  • ROW_NUMBER()
  • PERCENT_RANK()
  • CUME_DIST()

ROW_NUMBER() Function

ROW_NUMBER is one of the most common ranking window functions in MySQL which provides a normal serial number to each row present within a partition. It is just a normal serial-wise numbering of rows in ascending order. Even the rows which have the same value when using the ORDER BY clause has been given a different row number.

Syntax

Let's understand the use of row_number with an example.

Example

In this example, let's take a table Car which will contain details of cars like the model and release year. We will apply the row_number() function on the table to assign incremental sequences to each row from the table.

Let's consider the following table:

car_namereleased_year
Hyundai Verna2016
Honda City1955
Audi Q32016
BMW X42010
Mercedes Benz2019

Code

Output:

car_namerow_num
Honda City1
Hyundai Verna2
Audi Q33
BMW X44
Mercedes Benz5

Explanation:

In the above example, we have used to table car which contains car_name and released_year columns. The ROW_NUMBER() window function is used here to assign numerical sequences to each row of the table. The rows are ordered by or sorted by the released_year using the ORDER BY clause as seen in the output table.

RANK() and DENSE_RANK() Functions

RANK()

RANK() is a kind of ranking window function in MySQL which is used to assign ranks to the rows within an ordered partition. The rows which contain the same values will be assigned the same rank, with the next ranking or rankings skipped. For example, if we have 4 items with the same value starting with ranking 1 then all 4 items will have rank 1, and the ranking from the 5th item would be 5 with 2,3,4 skipped.

Syntax:

Example

In this example, we will consider a table Car which will contain details of cars like the model and released year and price. We will apply the RANK() function on the table to assign rankings to each row from the table.

Let's consider the table:

car_namereleased_yearprice
Hyundai Verna2016800000
Honda City1955800000
Audi Q320167500000
BMW X420103000000
Mercedes Benz20197500000

Code

Output

car_nameranking
Audi Q31
Mercedes Benz1
BMW X43
Hyundai Verna4
Honda City4

Explanation

In the above example, we have used to table car which contains the car_name, released_year and price columns. The RANK() window function is used here to assign a ranking to each row of the table based on their prices. The rows are ordered by or sorted in descending order by the prices using the ORDER BY clause as seen in the output table.

DENSE_RANK()

The DENSE_RANK() function is a type of ranking window function in MySQL which is used to assign rankings to the rows within an ordered partition. DENSE_RANK() is almost as same as the RANK() window function but in DENSE_RANK() no rankings are skipped even if the rows have the same value.

Syntax

Example

Let's take the same example we have used in the RANK() function section. Here, we will apply the DENSE_RANK() method to assign rankings without skipping the ranks.

Code

Output

car_nameranking
Audi Q31
Mercedes Benz1
BMW X42
Hyundai Verna3
Honda City3

Explanation:

As seen in the previous example, the ranking is assigned based on price. Here, with the use of DENSE_RANK() the rankings are assigned and displayed with the car_name attribute. As seen in the output, the rankings are assigned without skipping the previous ranks even if the rows contain the same value.

LAG() and LEAD() Functions

LAG() and LEAD() both are types of analytical window functions in MySQL. As the name suggests, the LAG() function starts the value from the previous row (returns NULL if starting from the first row as no preceding row exists). It returns the value of the row before the current row of the partition. Whereas the LEAD() function returns the value of the row after the current row in a partition. It starts from the value of the next row (returns NULL if no row if no more rows are available).

Syntax (LAG)

Syntax (LEAD)

Example

Here, we will consider the same table car which contains different car details. We will apply both LAG() and LEAD window functions to demonstrate their practical usage.

Consider a table car:

car_namereleased_yearprice
Hyundai Verna2016800000
Honda City1955800000
Audi Q320167500000
BMW X420103000000
Mercedes Benz20197500000

Code (LAG)

Output

car_namepricelag_column
Honda City800000NULL
BMW X43000000800000
Hyundai Verna8000003000000
Audi Q37500000800000
Mercedes Benz75000007500000

Code (LEAD)

Output

car_namepricelead_column
Honda City8000003000000
BMW X43000000800000
Hyundai Verna8000007500000
Audi Q375000007500000
Mercedes Benz7500000NULL

Explanation

As seen in the above example, we have used both the LAG() and LEAD()functions on the same table and set of rows. The LAG() function returns values from one preceding row i.e., here it returns NULL in the first row as there was no preceding row available. The LEAD() function returns the value from one subsequent row i.e., here it starts returning the value from the second row and the last row returns the NULL value as there are no subsequent rows available.

FIRST_VALUE() and LAST_VALUE() Functions

FIRST_VALUE() and LAST_VALUE() are one of the most commonly used analytics window functions in MySQL. The FIRST_VALUE() function shows the value from the first row of a window frame or within a partition. LAST_VALUE() will return the value of the last row of a window frame or within a partition.

Syntax(FIRST_VALUE)

Syntax(LAST_VALUE)

Example

In this example, we will consider a table Employee which contains details about the employee working in different departments with their salary. Here, we will use the FIRST_VALUE() and LAST_VALUE()methods to return the name of the employee having the least salary and highest salary from each department respectively.

Consider a table Employee:

employee_namedepartmentsalary
AbdulFinance68000
AmanIT60000
JaiIT45000
MohitIT70000
NileshFinance55000
RaviAccounting60000
SushantFinance65000
VishalAccounting40000

Code(FIRST_VALUE)

Output

employee_namedepartmentsalaryleast_salary
VishalAccounting40000Vishal
RaviAccounting60000Vishal
NileshFinance55000Nilesh
SushantFinance65000Nilesh
AbdulFinance68000Nilesh
JaiIT45000Jai
AmanIT60000Jai
MohitIT70000Jai

Code(LAST_VALUE)

Output

employee_namedepartmentsalarymax_salary
VishalAccounting40000Ravi
RaviAccounting60000Ravi
NileshFinance55000Abdul
SushantFinance65000Abdul
AbdulFinance68000Abdul
JaiIT45000Mohit
AmanIT60000Mohit
MohitIT70000Mohit

Explanation

In the above example, we have used both the FIRST_VALUE() and LAST_VALUE() methods on the same table. Here, using the FIRST_VALUE() method we've returned the name of the employee having least_salary for every partition (department). And using the LAST_VALUE() we've returned the name of the employee having max_salary for every partition (department).

NOTE: RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is a clause that defines the frame in the partition from the first row to the last row.

NTILE() Function

NTILE() function is a kind of analytical window function in MySQL which is used to divide rows into groups within an ordered partition. Each of the divided groups will be assigned a number which will be the same for that group of rows. Every group is represented by a number.

Syntax

Example

Here the same table Employee is used which will be divided into groups using the NTILE() window function.

Code

Output

employee_namedepartmentsalarygrouping
VishalAccounting400001
RaviAccounting600001
NileshFinance550001
SushantFinance650002
AbdulFinance680002
JaiIT450002
AmanIT600003
MohitIT700003

Explanation

Here, we have considered the same Employee table used in the previous examples. We've used the NTILE() to form a total of 3 groups from the rows of the Employee table. The first group contains 3 rows, the second group contains 3 rows and the last group contains 2 rows.

Using Window Functions in Queries

Let's see an example to display the window functions in MySQL queries.

Example

In this example, we will use three Aggregate window functions MIN(), MAX(), and COUNT() to find the minimum price, maximum price, and count of cars in respective years.

Let's consider a table car:

car_namereleased_yearprice
Hyundai Verna2016800000
Honda City1955800000
Audi Q320167500000
BMW X420103000000
BMW X520102000000
Mercedes Benz Q520196700000
Mercedes Benz20197500000

Code

Output

car_namereleased_yearpricemin_pricemax_pricecount_of_cars
Honda City19558000008000008000001
BMW X420103000000200000030000002
BMW X520102000000200000030000002
Hyundai Verna201680000080000075000002
Audi Q32016750000080000075000002
Mercedes Benz20196700000670000075000002
Mercedes Benz Q520197500000670000075000002

Explanation

The table car includes the released year, price, and model name information. Here, we have used three aggregate window functions MIN(), MAX(), and COUNT(). The MIN() function returns the minimum price for every year a car is released. The MAX() function returns the maximum price for every year a car is released. The COUNT() function returns the number of cars released each year.

Performance Considerations

Let's now discuss some of the advantages when performance is taken into consideration for window functions in MySQL:

  • Window functions in MySQL operate on a set of rows(Window) rather than a single column.
  • Window function keeps individual rows and adds a summary column.
  • Window functions in MySQL can be applied on fixed as well as sliding window frames.
  • Window functions in MySQL contain several methods which are used to efficiently extract meaningful information from the database.

Conclusion

Let's summarize whatever we've learned till now:

  • The Window functions in MySQL are special kinds of pre-defined methods that are used to perform computations on a set of rows.
  • The set of rows on the methods that are applied is referred to as a Window.
  • Using an OVER clause with window functions in MySQL is mandatory.
  • The Window functions are divided into three categories:- - Aggregate Window Functions - Analytical Window Functions -Ranking Window Functions`

See Also