SQL Server GROUP BY clause
Group By . clause
The GROUP BY clause divides the result set into one or more subsets. Each subset has common values and expressions. The GROUP BY keyword is followed by a list of columns, called grouped column . Each grouped column restricts the number of rows of the result set. For each grouped column, there is only one row. The GROUP BY clause can have more than one grouped column.
SELECT select_list FROM table_name GROUP BY column_name1,column_name2, ... ;
Consider the scenario of the WorkOrderRouting table in AdventureWorks2019 . Total resource hours for each work order must be calculated. To achieve this, the records must be grouped by the work order number, which is the WorkOrderID column
The example below retrieves and displays the total resource hours for each work order along with the work order number. In this query, a built-in function named SUM() is used to calculate the sum. SUM() is a function that sums a column’s records.
SELECT WorkOrderID,SUM(ActualResourceHrs) AS TotalHoursPerWorkOrder FROM Production.WorkOrderRouting GROUP BY WorkOrderID
The execute statement will return all WorkOrders and total resource hours.
The GROUP BY clause can also be used in conjunction with other clauses such as:
GROUP BY with WHERE
The WHERE clause can be used with the GROUP BY clause to restrict rows for data grouping. Rows that satisfy the search criteria will be considered for regrouping. Rows that do not satisfy the search criteria will be removed before the grouping process is complete.
SELECt WorkOrderID,SUM(ActualResourceHrs) AS TotalHoursPerWorkOrder FROM Production.WorkOrderRouting WHERE WorkOrderID < 50 GROUP BY WorkOrderID
GROUP BY with NULL
If the group column contains null, the row becomes a separate group in the result set. If the group column contains more than one NULL value, the NULL values are included in a single row. Consider the Production.Product table. There are some rows where there is a NULL value in the Class column.
Using GROUP BY for a table query will also use NULL values. For example, the code below retrieves and displays the average price of the list price for each Class
SELECT Class, AVG(ListPrice) AS 'AverageListPrice' FROM Production.Product GROUP BY Class
GROUP BY with ALL
The ALL keyword can be used with the GROUP BY clause. It only makes sense when the SELECT has a WHERE clause. When ALL is used, it includes all groups that the GROUP BY clause creates. It even includes groups that do not meet the search conditions.
SELECT <column_name> FROM <table_name> WHERE <condition> GROUP BY ALL <column_name>
Consider the Sales.SalesTerritory table, which has a column named Group that indicates the coordinate location of the sales area. The code below will calculate and display the total sales for each group. The output needs to show all groups regardless of whether they have sales or not. To achieve this, the code uses GROUP BY with ALL
SELECT [Group], SUM(SalesYTD) AS 'TotalSales' FROM Sales.SalesTerritory WHERE [Group] LIKE 'N%' OR [Group] LIKE 'E%' GROUP BY ALL [Group]
GROUP BY with HAVING
The HAVING clause is used with the SELECT statement to specify a search condition for a group. The HAVING clause acts like a WHERE clause in places where the WHERE clause cannot be used against aggregate functions like SUM(). Once you’ve created groups with the GROUP BY clause, you may want to further filter the results. The HAVING clause acts as a filter on lines, similar to how the WHERE clause acts as a filter on the rows returned by the FROM clause.
SELECT <column_name> FROM <table_name> GROUP BY <column_name> HAVING <search_condition>
SELECT [Group],SUM(SalesYTD) AS 'TotalSales' FROM Sales.SalesTerritory WHERE [Group] LIKE 'P%' GROUP BY ALL [Group] HAVING SUM (SalesYTD) < 6000000
The GROUP BY clause also uses the CUBE and ROLLUP operators to return summary data. The number of columns in the GROUP BY clause determines the number of summary rows in the result set. The operators are described as follows:
CUBE is the aggregate operator that generates a super aggregate row. In addition to the regular rows provided by the GROUP BY, it also provides a summary of the rows that the GROUP BY clause produces.
The summary row displayed for every possible combination of groups is the result set. The Summary row shows NULL in the result set, but returns all values for those at the same time.
SELECT <column_name> FROM <table_name> GROUP BY <column_name> WITH CUBE
The service shows the total sales for each country, excluding Australia and Canada
SELECT Name, CountryRegionCode, SUM(SalesYTD) AS TotalSales FROM Sales.SalesTerritory WHERE Name <> 'Australia' AND Name <> 'Canada' GROUP BY Name,CountryRegionCode WITH CUBE
CUBE is like an extension of the GROUP BY clause. CUBE allows you to subtotal for all group column combinations specified in the GROUP BY clause.
In addition to the regular rows generated by GROUP BY, it also injects summary rows into the result set. It is similar to the CUBE operator, but produces a result set that displays the groups in hierarchical order. It sorts the groups from lowest to highest. The grouping hierarchy in the results depends on the order in which the grouped columns are specified
SELECT <column_name1>[,<column_name2>] FROM <table_name> GROUP BY < WITH ROLLUP
SELECT [Name], SUM(SalesYTD) AS TotalSales FROM Sales.SalesTerritory GROUP BY [Name] WITH ROLLUP