Summary: in this tutorial, you will learn how to use the SQL PARTITION BY
clause to change how the window function calculates the result.
Introduction to the SQL PARTITION BY clause #
The PARTITION BY
clause is a subclause of the OVER
clause. The PARTITION BY
clause divides a query’s result set into partitions. The window function is operated on each partition separately and recalculates for each partition.
The following shows the syntax of the PARTITION BY
clause:
window_function ( expression ) OVER (
PARTITION BY expression1, expression2, ...
order_clause
frame_clause
)
Code language: SQL (Structured Query Language) (sql)
You can specify one or more columns or expressions to partition the result set. The expression1
, expression1
, etc., can only refer to the columns derived by the FROM
clause. They cannot refer to expressions or aliases in the select list.
The expressions of the PARTITION BY
clause can be column expressions, scalar subquery, or scalar function. Note that a scalar subquery and scalar function always returns a single value.
If you omit the PARTITION BY
clause, the whole result set is treated as a single partition.
PARTITION BY vs. GROUP BY #
The GROUP BY
clause is used often used in conjunction with an aggregate function such as SUM()
and AVG()
. The GROUP BY
clause reduces the number of rows returned by rolling them up and calculating the sums or averages for each group.
For example, the following statement returns the average salary of employees by departments:
SELECT
department_id,
ROUND(AVG(salary)) avg_department_salary
FROM
employees
GROUP BY
department_id
ORDER BY
department_id;
Code language: SQL (Structured Query Language) (sql)
The following picture shows the result:
The PARTITION BY
clause divides the result set into partitions and changes how the window function is calculated. The PARTITION BY
clause does not reduce the number of rows returned.
The following statement returns the employee’s salary and also the average salary of the employee’s department:
SELECT
first_name,
last_name,
department_id,
ROUND(AVG(salary) OVER (
PARTITION BY department_id
)) avg_department_salary
FROM
employees;
Code language: SQL (Structured Query Language) (sql)
Here is the partial output:
In simple words, the GROUP BY
clause is aggregate while the PARTITION BY
clause is analytic.
In this tutorial, you have learned about the SQL PARTITION BY
clause that changes how the window function’s result is calculated.