The MySQL GROUP BY Statement

The GROUP BY clause in MySQL groups table rows that share the same values in one or more columns. It is typically used in SELECT statements together with aggregate functions such as COUNT, SUM, AVG, MAX, and MIN to produce one summary row per group.

GROUP BY Syntax

The syntax of the GROUP BY clause in MySQL is as follows:
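A general form consistent with the parameters described in this section might look like the sketch below. The placeholder names column_1, column_2, table_name, and condition are illustrative assumptions; only aggregate_function(column_Z) and the optional WHERE clause come from the text.

```sql
-- Template only: replace the placeholders with real column and table names.
SELECT column_1, column_2, aggregate_function(column_Z)
FROM table_name
WHERE condition            -- optional: filters rows before grouping
GROUP BY column_1, column_2;
```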

Parameters

  • The notation aggregate_function(column_Z) denotes an aggregate function that can be applied to the data in column_Z.
  • The WHERE clause is optional. It can be used to filter the data before grouping.

MySQL GROUP BY Examples

In this section, we will explore several examples of how to use the GROUP BY clause in MySQL.

Simple MySQL GROUP BY Example

Suppose you have a table named Orders with three columns: id (the primary key), cust_id (the customer ID), and amount (the order amount). The Orders table contains the following data. The queries used to create the table, insert the data, and display the table are also given below:

Orders Table

id | cust_id | amount
---|---------|-------
1  | 1       | 105
2  | 1       | 78
3  | 3       | 55
4  | 3       | 42
5  | 2       | 215

Query used to create the Orders Table:
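A minimal sketch of such a statement is shown here. The column types are an assumption; the text only names the columns and marks id as the primary key.

```sql
-- Column types are assumed; only the names and the primary key come from the text.
CREATE TABLE Orders (
    id      INT PRIMARY KEY,
    cust_id INT,
    amount  INT
);
```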

Query used to insert the above data into the Orders Table:
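One way to load the five rows listed in the Orders table is a single multi-row INSERT, sketched here with the column names given in the text:

```sql
-- The values mirror the rows of the Orders table shown in this example.
INSERT INTO Orders (id, cust_id, amount) VALUES
    (1, 1, 105),
    (2, 1, 78),
    (3, 3, 55),
    (4, 3, 42),
    (5, 2, 215);
```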

Query used to display the data present in the Orders Table:
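For instance, a plain SELECT over all columns is enough to list every row of the table:

```sql
-- Returns every column of every row in Orders.
SELECT * FROM Orders;
```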

To group the data by cust_id and find the total amount of orders for each customer, you can use the following query:

Query
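A query consistent with the output shown in this example could be written as follows. The alias total_amount is taken from the column heading of that output.

```sql
-- One result row per customer, with the sum of that customer's order amounts.
SELECT cust_id, SUM(amount) AS total_amount
FROM Orders
GROUP BY cust_id;
```

Add ORDER BY cust_id if the rows need to come back in a specific order.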

Output

cust_id | total_amount
--------|-------------
1       | 183
3       | 97
2       | 215

The output shows the total order amount for each customer: 183 for customer ID 1 (105 + 78), 215 for customer ID 2, and 97 for customer ID 3 (55 + 42).

Using GROUP BY in MySQL With Aggregate Functions

In the last example, we saw how to use the GROUP BY clause in MySQL with the SUM aggregate function. Let us now discuss how to use other aggregate functions, such as MIN, MAX, COUNT, and AVG, with this clause.

1. MIN aggregate function: The MIN aggregate function is used to find the minimum value of each group. The following query uses the MIN aggregate function along with the GROUP BY clause in MySQL to find the minimum order amount for each customer.

Query
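A sketch of a query matching the description, assuming the same Orders table and the alias min_amount from the output heading:

```sql
-- Smallest order amount within each customer's group of rows.
SELECT cust_id, MIN(amount) AS min_amount
FROM Orders
GROUP BY cust_id;
```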

Output

cust_id | min_amount
--------|-----------
1       | 78
3       | 42
2       | 215

The output shows the minimum order amount for each customer: 78 for customer ID 1 (the smaller of 105 and 78), 215 for customer ID 2, and 42 for customer ID 3 (the smaller of 55 and 42).

2. MAX aggregate function: The MAX aggregate function is used to find the maximum value of each group. The following query uses the MAX aggregate function along with the GROUP BY clause in MySQL to find the maximum order amount for each customer.

Query
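The corresponding query could look like this, again assuming the Orders table and taking the alias max_amount from the output heading:

```sql
-- Largest order amount within each customer's group of rows.
SELECT cust_id, MAX(amount) AS max_amount
FROM Orders
GROUP BY cust_id;
```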

Output

cust_id | max_amount
--------|-----------
1       | 105
3       | 55
2       | 215

The output shows the maximum order amount for each customer: 105 for customer ID 1 (the larger of 105 and 78), 215 for customer ID 2, and 55 for customer ID 3 (the larger of 55 and 42).

3. COUNT aggregate function: The COUNT aggregate function is used to count the rows or values in each group. The following query uses the COUNT aggregate function along with the GROUP BY clause in MySQL to find the number of orders for each customer.

Query
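One plausible form of this query is sketched here, with the alias num_orders taken from the output heading:

```sql
-- COUNT(*) counts every row in the group.
SELECT cust_id, COUNT(*) AS num_orders
FROM Orders
GROUP BY cust_id;
```

COUNT(*) counts all rows in a group, whereas COUNT(column) counts only the rows in which that column is not NULL. The two give the same result here because the sample data contains no NULL values.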

Output

cust_id | num_orders
--------|-----------
1       | 2
3       | 2
2       | 1

The output shows the number of orders for each customer: 2 for customer ID 1 (orders worth 105 and 78 units), 1 for customer ID 2 (a single order worth 215 units), and 2 for customer ID 3 (orders worth 55 and 42 units).

4. AVG aggregate function: The AVG aggregate function is used to find the average of data values in each group. The following query uses the AVG aggregate function along with the GROUP BY clause in MySQL to find the average order value for each customer.

Query
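A query along these lines would produce the averages described, assuming the Orders table and the alias avg_order_value from the output heading:

```sql
-- Mean order amount within each customer's group of rows.
SELECT cust_id, AVG(amount) AS avg_order_value
FROM Orders
GROUP BY cust_id;
```

Depending on the column type, MySQL may display the averages with additional decimal places (for example, 91.5000 for an integer column).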

Output

cust_id | avg_order_value
--------|----------------
1       | 91.5
3       | 48.5
2       | 215

The output shows the average order value for each customer: 91.5 for customer ID 1 (the average of 105 and 78), 215 for customer ID 2, and 48.5 for customer ID 3 (the average of 55 and 42).

MySQL GROUP BY With Expression Example

You can also use expressions in the GROUP BY clause to group data based on calculated values. Consider the following query on the same Orders table to understand the usage of GROUP BY with an expression example.

Query
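The sketch below shows one query that would yield the grouping in the output. The thresholds 50 and 200 are assumptions chosen only so that the sample data falls into the Low, Medium, and High counts shown; the aliases come from the output headings.

```sql
-- Thresholds (50 and 200) are assumed; they are not stated in the article.
SELECT
    CASE
        WHEN amount < 50  THEN 'Low'
        WHEN amount < 200 THEN 'Medium'
        ELSE 'High'
    END AS amount_range,
    COUNT(*) AS count_orders
FROM Orders
GROUP BY amount_range;   -- MySQL lets GROUP BY refer to the SELECT alias
```

Grouping by the alias keeps the CASE expression in one place; repeating the full expression in the GROUP BY clause would work as well.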

Output

amount_range | count_orders
-------------|-------------
Low          | 1
Medium       | 3
High         | 1

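This query divides the orders into three groups based on the order amount: one order falls into the Low range, three into the Medium range, and one into the High range.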

Using MySQL GROUP BY With HAVING Clause Example

You can use the HAVING clause with the GROUP BY clause to filter groups based on the results of aggregate functions. Note that the WHERE clause cannot be used for this purpose. The WHERE clause filters individual rows before they are grouped, whereas an aggregate function is computed over all the rows of a group and returns a single value per group; the HAVING clause is evaluated after that step.

The following query returns the customers whose total order value is less than or equal to 200.

Query
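A possible form of this query is sketched here, assuming the "less than or equal to 200" condition used in the explanation of the output:

```sql
-- Groups are formed first; HAVING then keeps only groups whose total is at most 200.
SELECT cust_id, SUM(amount) AS total_amount
FROM Orders
GROUP BY cust_id
HAVING SUM(amount) <= 200;
```

MySQL also accepts the alias in this position, so HAVING total_amount <= 200 is an equivalent way to write the condition.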

Output

cust_id | total_amount
--------|-------------
1       | 183
3       | 97

Recall that the total order value is 183 for customer ID 1 (105 + 78), 215 for customer ID 2, and 97 for customer ID 3 (55 + 42). The query returns only those customers whose total order value is less than or equal to 200. Customer ID 2 exceeds this limit, so only customers 1 and 3 appear in the output.

The GROUP BY Clause: MySQL vs SQL Standard

The GROUP BY clause in MySQL largely follows the SQL standard, but there are some differences in how MySQL handles specific situations. Under the older SQL-92 rule, every column in the SELECT list of a grouped query must either appear in the GROUP BY clause or be used inside an aggregate function. MySQL is more permissive: if a non-grouped column is functionally dependent on the grouped columns (for example, when grouping by a primary key), MySQL permits it in the SELECT list. In addition, when the ONLY_FULL_GROUP_BY SQL mode is disabled, MySQL accepts any non-grouped column and returns its value from an arbitrary row of each group. Another difference is that MySQL allows the GROUP BY clause to refer to a column alias defined in the SELECT list, which standard SQL does not permit.
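As an illustration of functional dependence, the sketch below groups the Orders table from the earlier examples by its primary key. It assumes a MySQL version that detects functional dependence (5.7.5 or later).

```sql
-- id is the primary key, so cust_id and amount are functionally dependent on it
-- and may appear in the SELECT list without being grouped or aggregated.
SELECT id, cust_id, amount
FROM Orders
GROUP BY id;
```

By contrast, SELECT cust_id, amount FROM Orders GROUP BY cust_id is rejected when ONLY_FULL_GROUP_BY is enabled, because amount is neither grouped, aggregated, nor determined by cust_id.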

The GROUP BY Clause vs DISTINCT Clause

In database queries, two clauses can be used to eliminate duplicate values: GROUP BY and DISTINCT. While they achieve similar outcomes, they have distinct purposes.

DISTINCT is used to remove duplicate rows from the result set, considering all columns specified in the SELECT statement.

On the other hand, GROUP BY groups rows based on one or more columns so that aggregate functions can be applied to each group. Rather than discarding duplicate rows, it merges the rows that share the same grouping values into a single row per group.

If the objective is to remove duplicate rows, the DISTINCT clause should be used. However, if the aim is to group data and apply aggregate functions to each group, the GROUP BY clause is appropriate.
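The difference can be sketched on the Orders table from the earlier examples. The alias num_orders is reused from the COUNT example.

```sql
-- Both statements return each cust_id once.
SELECT DISTINCT cust_id FROM Orders;
SELECT cust_id FROM Orders GROUP BY cust_id;

-- Only GROUP BY can attach a per-group aggregate to each value.
SELECT cust_id, COUNT(*) AS num_orders
FROM Orders
GROUP BY cust_id;
```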

Limitations of the GROUP BY Clause in MySQL

The GROUP BY clause in MySQL is a commonly used clause for aggregating data. Nonetheless, it presents some limitations that any MySQL user should be aware of. In the following points, we shall explore some of the constraints of MySQL's GROUP BY clause.

  • One of the most prominent drawbacks of the GROUP BY clause is its impact on query performance. To form the groups, MySQL may have to scan the whole table and build a temporary table or sort the rows, which can be costly on large datasets. An index on the grouping columns can reduce this cost.
  • Grouping by multiple columns requires care. Each distinct combination of values forms its own group, so the number of groups can grow quickly, and every non-aggregated column in the SELECT list must appear in the GROUP BY clause or be functionally dependent on the grouped columns.
  • The GROUP BY clause can produce surprising results when a grouping column contains NULL values. All rows with NULL in that column are placed together in a single group, which may not be what the query intends. To handle this, you can use the COALESCE function to substitute a default value for NULL before grouping.
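The COALESCE approach from the last point can be sketched as follows. The example assumes, purely for illustration, that some rows in Orders have a NULL cust_id (the sample data in this article has none), and it uses 0 as a hypothetical placeholder value.

```sql
-- Hypothetical: cust_id may be NULL; 0 is an assumed stand-in for "unknown customer".
SELECT COALESCE(cust_id, 0) AS customer, SUM(amount) AS total_amount
FROM Orders
GROUP BY COALESCE(cust_id, 0);
```

The placeholder should be a value that cannot collide with a real customer ID; otherwise the NULL rows are silently merged into that customer's group.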

Conclusion

In conclusion, the GROUP BY clause in MySQL lets you group rows by one or more columns and apply aggregate functions to each group. It is a useful tool for summarizing and analyzing data in a database. The key points from this article are the following:

  • The GROUP BY clause is written after the FROM clause and the optional WHERE clause of a SELECT statement.
  • You can use the GROUP BY clause with aggregate functions to perform calculations on grouped data.
  • You can use expressions in the GROUP BY clause to group data based on calculated values.
  • You can filter groups based on aggregate functions using the HAVING clause with the GROUP BY clause.
  • There exist some differences between the GROUP BY clause in MySQL and the SQL standard.
  • The GROUP BY clause and the DISTINCT clause have different uses and should be used accordingly.