· SQL Guroo · SQL Tutorial  · 3 min read

SQL Window Functions - Understanding PARTITION BY

Learn how to use PARTITION BY in SQL with simple examples.

Learn how to use PARTITION BY in SQL with simple examples.

Introduction

PARTITION BY divides your data into groups (partitions) for calculations. Think of it like sorting items into different boxes and then performing calculations within each box separately.

Why PARTITION BY Matters?

It’s useful when you need to:

  • Calculate averages per group
  • Find totals within categories
  • Compare values within groups
  • Perform calculations for separate categories

Main Content: Understanding PARTITION BY

Let’s use this simple table:

-- Create a simple table
CREATE TABLE class_scores (
    student_name VARCHAR(50),
    class VARCHAR(50),
    score INT
);

-- Insert sample data
INSERT INTO class_scores VALUES
('Alice', 'Math', 85),
('Alice', 'Science', 90),
('Bob', 'Math', 78),
('Bob', 'Science', 85),
('Charlie', 'Math', 92),
('Charlie', 'Science', 88),
('David', 'Math', 88),
('David', 'Science', 95);

Example 1: Basic PARTITION BY

Let’s compare each score to the class average:

SELECT 
    student_name,
    class,
    score,
    AVG(score) OVER (PARTITION BY class) as class_average
FROM class_scores
ORDER BY class, score DESC;

This query:

  1. Groups scores by class
  2. Calculates average for each class
  3. Shows individual scores and class averages

Results:

student_name | class   | score | class_average
-------------|---------|-------|---------------
Charlie      | Math    | 92    | 85.75
David        | Math    | 88    | 85.75
Alice        | Math    | 85    | 85.75
Bob          | Math    | 78    | 85.75
David        | Science | 95    | 89.50
Alice        | Science | 90    | 89.50
Charlie      | Science | 88    | 89.50
Bob          | Science | 85    | 89.50

Example 2: Multiple Calculations

Let’s see minimum, maximum, and average scores per class:

SELECT 
    class,
    student_name,
    score,
    MIN(score) OVER (PARTITION BY class) as lowest_score,
    MAX(score) OVER (PARTITION BY class) as highest_score,
    AVG(score) OVER (PARTITION BY class) as average_score
FROM class_scores
ORDER BY class, score DESC;

This query shows:

  1. Individual scores
  2. Lowest score in each class
  3. Highest score in each class
  4. Average score for each class

Results:

class   | student_name | score | lowest_score | highest_score | average_score
--------|--------------|-------|--------------|---------------|---------------
Math    | Charlie      | 92    | 78          | 92           | 85.75
Math    | David        | 88    | 78          | 92           | 85.75
Math    | Alice        | 85    | 78          | 92           | 85.75
Math    | Bob          | 78    | 78          | 92           | 85.75
Science | David        | 95    | 85          | 95           | 89.50
Science | Alice        | 90    | 85          | 95           | 89.50
Science | Charlie      | 88    | 85          | 95           | 89.50
Science | Bob          | 85    | 85          | 95           | 89.50

Example 3: Comparing to Group Total

Let’s calculate what percentage each score represents of the class total:

SELECT 
    class,
    student_name,
    score,
    SUM(score) OVER (PARTITION BY class) as class_total,
    ROUND(
        (score * 100.0 / SUM(score) OVER (PARTITION BY class)),
    2) as percent_of_total
FROM class_scores
ORDER BY class, score DESC;

This query:

  1. Calculates total score per class
  2. Shows each score as a percentage
  3. Groups calculations by class

Results:

class   | student_name | score | class_total | percent_of_total
--------|--------------|-------|-------------|------------------
Math    | Charlie      | 92    | 343        | 26.82
Math    | David        | 88    | 343        | 25.66
Math    | Alice        | 85    | 343        | 24.78
Math    | Bob          | 78    | 343        | 22.74
Science | David        | 95    | 358        | 26.54
Science | Alice        | 90    | 358        | 25.14
Science | Charlie      | 88    | 358        | 24.58
Science | Bob          | 85    | 358        | 23.74

Best Practices

  1. Choosing Partition Columns

    • Use logical grouping columns
    • Don’t create too many partitions
    • Consider data distribution
  2. Ordering

    • Use ORDER BY for final results
    • Make output easy to read
    • Group related data together
  3. Calculations

    • Keep calculations simple
    • Use meaningful column names
    • Round decimals appropriately
  4. Common Uses

    • Group averages
    • Running totals per group
    • Percentages within groups
    • Rankings within categories
  5. Things to Remember

    • PARTITION BY creates groups
    • Each group gets its own calculations
    • Results show with every row
    • Groups are independent

Remember:

  • PARTITION BY divides data into groups
  • Each group gets separate calculations
  • All rows show their group’s results
  • Perfect for group-level analysis

Happy querying! 🚀

Back to Blog

Related Posts

View All Posts »