Here's something we haven't talked about yet: how do you filter groups after applying aggregates? Sometimes you don't need all the faculties—just the ones with more than a hundred students. Or maybe you only care about departments where the average salary is above 50,000. Today, we're gonna check out how to filter aggregated data using HAVING.
Why do we need HAVING if we already have WHERE? Can't we just slap a WHERE after GROUP BY? :)
It's not that simple! First off, the order of SQL operators is fixed, and WHERE gets executed before GROUP BY.
But what if we just move it after GROUP BY?
Nope, that doesn't work either! A lot of the time, you need to filter table rows before grouping. Then you group the filtered data. And then you might want to toss out some groups you don't need after grouping.
So maybe we just take the WHERE operator, copy it, call it HAVING, and put it after GROUP BY?
Yep, that's exactly what we do! :)
Difference between HAVING and WHERE
WHERE filters rows before grouping.
Imagine you're picking cakes by flavor: you keep the strawberry and chocolate ones, and put the rest aside. That's a job for WHERE.
HAVING filters after the data has been grouped and aggregate functions have done their magic.
For example, you've already grouped cakes by tables, counted how many there are, and now you only want to keep the tables with more than three cakes.
So, HAVING is used to filter data at the group level.
Syntax of HAVING
The syntax is almost the same as WHERE, but it works a bit differently:
SELECT columns, aggregate_functions
FROM table
GROUP BY columns
HAVING condition;
Execution steps:
- First, rows are filtered with
WHERE. - Then the data is grouped using
GROUP BY. - Aggregate functions are applied to the grouped results.
- Finally, the resulting groups are filtered with
HAVING.
Examples of using HAVING
Example 1: Filtering faculties with a large number of students
You want to know which university faculties have more than 100 students. Let's say we have a students table:
| id | name | faculty |
|---|---|---|
| 1 | Alice | Engineering |
| 2 | Bob | Engineering |
| 3 | Charlie | Arts |
| 4 | Daisy | Business |
| 5 | ... | ... |
Query:
SELECT faculty, COUNT(*) AS student_count
FROM students
GROUP BY faculty
HAVING COUNT(*) > 100;
What's going on here:
- First, we group students by the
facultycolumn usingGROUP BY. - Then the aggregate function
COUNT(*)counts the number of students in each faculty. - Finally,
HAVINGtosses out all faculties with 100 or fewer students.
Result:
| faculty | student_count |
|---|---|
| Engineering | 150 |
| Arts | 120 |
Example 2: Departments with high average salary
You want to find only those departments where the average employee salary is over 50,000. Let's say we have an employees table:
| id | name | department | salary |
|---|---|---|---|
| 1 | Alice | IT | 60000 |
| 2 | Bob | HR | 45000 |
| 3 | Charlie | IT | 70000 |
| 4 | Daisy | HR | 52000 |
| 5 | ... | ... | ... |
Query:
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;
Result:
| department | avg_salary |
|---|---|
| IT | 65000 |
Notice: HAVING works with results that were calculated after GROUP BY.
Order of execution: WHERE, GROUP BY, and HAVING
Filtering with WHERE and HAVING happens at different stages. To get a better feel for the difference, let's look at the step-by-step process of a query:
WHERE: row filtering.At this stage, all table rows are processed. If a row doesn't pass the
WHEREcondition, it doesn't go any further.GROUP BY: grouping rows.After filtering, rows are grouped based on the columns listed in
GROUP BY.Aggregate functions:
Aggregate functions like
COUNT(),AVG(),SUM(), etc. are applied to the grouped data.HAVING: filtering groups.At this stage, only the aggregate results are processed.
HAVINGconditions are applied only to groups.
Features of HAVING
Feature 1: Working with aggregates
The main difference between HAVING and WHERE is working with aggregate functions. For example:
SELECT department, AVG(salary)
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;
In this query, you can't use AVG(salary) inside WHERE, because WHERE processes rows before grouping. A query like:
SELECT department, AVG(salary)
FROM employees
WHERE AVG(salary) > 50000
GROUP BY department;
will throw an error: aggregate functions are not allowed in WHERE.
Feature 2: Filtering without grouping
You can use HAVING even without an explicit GROUP BY. In this case, the query is treated as if there's just one group—all the records:
SELECT AVG(salary) AS avg_salary
FROM employees
HAVING AVG(salary) > 50000;
Practical example
Let's say we have a store and a sales table:
| id | product_id | sales_amount |
|---|---|---|
| 1 | 101 | 200.00 |
| 2 | 102 | 300.00 |
| 3 | 101 | 400.00 |
| 4 | 103 | 150.00 |
Query: find products with total sales over 500.
SELECT product_id, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY product_id
HAVING SUM(sales_amount) > 500;
Result:
| product_id | total_sales |
|---|---|
| 101 | 600.00 |
Common mistakes
Using aggregates in WHERE:
For example:
SELECT department, AVG(salary)
FROM employees
WHERE AVG(salary) > 50000
GROUP BY department;
Error: you can't use aggregate functions in WHERE.
Mistakes with NULL:
If your data has NULLs, filtering can give you weird results. For example:
SELECT department, SUM(salary)
FROM employees
GROUP BY department
HAVING SUM(salary) > 0;
If the salary column only has NULLs, the result might be zero or empty.
Congrats! At this point, you can confidently filter aggregated data! Don't forget, HAVING is your key to group-level analytics, where regular WHERE just isn't enough.
GO TO FULL VERSION