CodeGym /Courses /SQL SELF /Common Mistakes When Using Window Functions

Common Mistakes When Using Window Functions

SQL SELF
Level 30 , Lesson 4
Available

Now it's time to talk about the tricky stuff that can pop up when you're working with window functions. Like in programming (and in life), it's always better to learn from someone else's mistakes. Let's break down the typical mistakes that beginners—and sometimes even experienced devs—make, and figure out how to dodge them.

Mistake #1: Incorrect Use of PARTITION BY

One of the most common mistakes is forgetting or messing up the PARTITION BY parameter, especially when you want to split your data into groups. Without it, PostgreSQL treats all rows as one big group, so your results might be totally different from what you expect.

Let's say we have a sales table with sales data:

id region month total
1 North 2023-01 1000
2 South 2023-01 800
3 North 2023-02 1200
4 South 2023-02 900

You want to find the running sum (SUM()) of sales by month for each region. You might write a query like this:

SELECT
    region,
    month,
    SUM(total) OVER (ORDER BY month) AS running_total
FROM 
    sales;

Result:

region month running_total
North 2023-01 1000
South 2023-01 1800
North 2023-02 3000
South 2023-02 3900

At first glance, it looks fine. But the result is clearly not what you want, because the running sum is calculated for all rows together, not by region. The problem is, we forgot to add PARTITION BY region.

Fixed code:

SELECT 
    region,
    month,
    SUM(total) OVER (PARTITION BY region ORDER BY month) AS running_total
FROM 
    sales;

Result:

region month running_total
North 2023-01 1000
North 2023-02 2200
South 2023-01 800
South 2023-02 1700

Now it works as expected: the data is grouped by region, and the running sum is calculated separately for each region.

Mistake #2: Wrong Order in ORDER BY

ORDER BY inside OVER() controls the order of rows in the window. If you get the order wrong, your results will be weird.

You want to find running totals of sales, ordered by month in descending order. You might write this query:

SELECT
    month,
    total,
    SUM(total) OVER (ORDER BY month DESC) AS running_total
FROM 
    sales;

Result:

month total running_total
2023-02 1200 1200
2023-02 900 2100
2023-01 1000 3100
2023-01 800 3900

At first glance, it looks right, but notice: the rows are grouped by month, but the sums aren't correct because of the descending order. That's why the results are confusing.

Fix: rewrite the query to use the correct ORDER BY order:

SELECT
    month,
    total,
    SUM(total) OVER (ORDER BY month ASC) AS running_total
FROM 
    sales;

Mistake #3: Using Window Functions Without Indexes

Window functions often work with big chunks of data, and if you don't have indexes on key columns, your performance can tank hard.

Example: we have a large_sales table with millions of rows, and we want to calculate sales ranks:

SELECT
    id,
    total,
    RANK() OVER (ORDER BY total DESC) AS rank
FROM 
    large_sales;

With small data, this query might be fast, but with lots of rows, it could take forever.

Fix: add an index on the column used in ORDER BY:

CREATE INDEX idx_total ON large_sales(total DESC);

Now the query will run way faster.

Mistake #4: Not Understanding the Window Defined by ROWS or RANGE

When using ROWS and RANGE, it's important to get how they define the window of rows. If you misunderstand these keywords, you'll get unexpected results.

Example: you want to calculate a moving average of sales for the current month and the two previous months:

SELECT
    month,
    AVG(total) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM 
    sales;

If you use RANGE instead of ROWS:

SELECT
    month,
    AVG(total) OVER (ORDER BY month RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM 
    sales;

The result will be different, because RANGE works with value ranges, not a specific number of rows.

Mistake #5: Overusing Window Functions

Using a bunch of window functions in one query can cause duplicate calculations and slow things down.

Example:

SELECT 
    id,
    total,
    SUM(total) OVER (PARTITION BY region) AS region_total,
    SUM(total) OVER (PARTITION BY region) / COUNT(total) OVER (PARTITION BY region) AS region_avg
FROM 
    sales;

Here, SUM(total) and COUNT(total) are calculated multiple times for each row.

Fix: make your query shorter using subqueries or a CTE:

WITH cte_region_totals AS (
    SELECT 
        region,
        SUM(total) AS region_total,
        COUNT(total) AS region_count
    FROM 
        sales
    GROUP BY 
        region
)
SELECT 
    s.id,
    s.total,
    t.region_total,
    t.region_total / t.region_count AS region_avg
FROM 
    sales s
JOIN 
    cte_region_totals t ON s.region = t.region;

Tips for Avoiding Mistakes

Double-check PARTITION BY and ORDER BY: always make sure your window is set up right.

Index your data: especially if you're sorting (ORDER BY) or filtering.

Use CTEs for repeated calculations: this helps cut down on duplicate steps.

Look at the execution plan: use EXPLAIN and EXPLAIN ANALYZE to see how PostgreSQL is running your query.

Test on real data: make sure your results match what you expect and your tasks are solved correctly.

2
Task
SQL SELF, level 30, lesson 4
Locked
Using ORDER BY in window functions
Using ORDER BY in window functions
1
Survey/quiz
Window Frame Setup, level 30, lesson 4
Unavailable
Window Frame Setup
Window Frame Setup
Comments
TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION