Summary: in this tutorial, you’ll learn how to use the SQL CUDE
to generate all possible combinations of groupings for specified columns.
Introduction to SQL CUBE #
The GROUP BY
clause divides rows of a result set into groups specified by one or more column values.
You can apply an aggregate function for each group to return an aggregated value like SUM
, AVG
, MIN
, MAX
, and COUNT
.
The CUBE
is an extension of the GROUP BY
clause, which allows you to generate all possible combinations of groupings for specified columns, including subtotals and the total.
Here’s the syntax of the GROUP BY
clause with CUBE
:
SELECT
column1,
column2,
aggregate_function (column3)
FROM
table_name
GROUP BY
CUBE (column1, column2);
Code language: SQL (Structured Query Language) (sql)
In this syntax, we have two columns specified in the CUBE
. The statement creates two subtotal combinations. Generally, if you have n
number of columns listed in the CUBE
, the statement will create 2n subtotal combinations.
The CUBE
operator works as follows
- First, generate all possible combinations of grouped data for the specified columns.
- Second, create an aggregate value for each combination of columns.
- Third, include a total row that aggregates all the rows.
In practice, you’ll find the CUBE
operator helpful for multidimensional analysis, like summarizing salary data by both states and jobs.
SQL CUBE examples #
We’ll use the CUBE
with the salary_reports
table:
State | Job | Salary |
---|---|---|
California | IT | 150000.00 |
California | Marketing | 130000.00 |
Texas | IT | 100000.00 |
Texas | Marketing | 80000.00 |
Using SQL CUBE with one column #
The following query uses the CUBE
in the GROUP BY
clause to calculate the subtotals of salary by state and the total salary of all states:
SELECT
state,
SUM(salary) AS salary
FROM
salary_reports
GROUP BY
CUBE (state)
ORDER BY
state;
Code language: SQL (Structured Query Language) (sql)
Output:
state | salary
------------+-----------
California | 280000.00
Texas | 180000.00
NULL | 460000.00
Code language: SQL (Structured Query Language) (sql)
The NULL
in the state column indicates the total salary for all states.
You can return a more meaningful label using the COALESCE
function:
SELECT
COALESCE(state, 'Total') state,
SUM(salary) AS salary
FROM
salary_reports
GROUP BY
CUBE (state)
ORDER BY
salary;
Code language: SQL (Structured Query Language) (sql)
Output:
state | salary
------------+-----------
Texas | 180000.00
California | 280000.00
Total | 460000.00
Code language: SQL (Structured Query Language) (sql)
Using SQL CUBE with multiple columns #
The following query analyzes the salary data across all combinations of state and job using the CUBE
in the GROUP BY
clause:
SELECT
state,
job,
SUM(salary) AS salary
FROM
salary_reports
GROUP BY
CUBE (state, job);
Code language: SQL (Structured Query Language) (sql)
Output:
state | job | salary
------------+-----------+-----------
NULL | NULL | 460000.00
California | IT | 150000.00
Texas | Marketing | 80000.00
Texas | IT | 100000.00
California | Marketing | 130000.00
California | NULL | 280000.00
Texas | NULL | 180000.00
NULL | Marketing | 210000.00
NULL | IT | 250000.00
Code language: SQL (Structured Query Language) (sql)
- Total: The row where both state and job are
NULL
represents the total for all salaries. - Subtotals for states: The row where the job column is
NULL
represents the total salary for each state. - Subtotals for jobs: The row where the state column is
NULL
represents the total salary for each job. - Detailed rows: The remaining rows show the total salary for each combination of state and job.
Summary #
- Use SQL
CUBE
for multidimensional analysis.