Menu
Try in Playground

SQLite GROUP BY and HAVING: Filtering Aggregated Results

How GROUP BY collapses rows into buckets in SQLite, and how HAVING filters those buckets after aggregation — with the WHERE vs HAVING distinction made concrete.

This page includes runnable editors — edit, run, and see output instantly.

GROUP BY Collapses Rows Into Buckets

Aggregate functions like COUNT, SUM, and AVG reduce many rows to one number. GROUP BY lets you do that per category — one number per customer, per month, per status. Each unique value (or combination of values) becomes a single row in the result.

Three customers, three rows out. The six original rows are gone — they've been collapsed into per-customer buckets, with COUNT(*) and SUM(amount) calculated within each one.

The mental model: GROUP BY customer says "treat all rows with the same customer as one group." Aggregates then operate on each group separately.

What You Can Put in the SELECT List

This trips people up. When you use GROUP BY, every column in the SELECT list must either be in the GROUP BY clause or be inside an aggregate function. Otherwise the value is ambiguous — which row from the group should it come from?

If you wrote SELECT region, rep, SUM(amount) with GROUP BY region, SQLite would happily run it (it's lenient where other databases reject this), but rep would be picked arbitrarily from the group. You'd get one rep name per region with no guarantee which one. Don't rely on that — group by every non-aggregated column you display.

HAVING Filters Groups After Aggregation

WHERE filters rows before grouping. HAVING filters groups after grouping. That's the whole distinction, and it's why you can't put COUNT(*) > 1 in a WHERE clause — at the time WHERE runs, the count doesn't exist yet.

Cleo placed only one order, so her group is filtered out. Ada and Boris remain. The condition runs against each group's aggregated value, not against individual rows.

You can reference column aliases from the SELECT list directly in HAVING — SQLite allows it:

That's often more readable than repeating SUM(amount) in the HAVING clause.

WHERE vs HAVING: Use Both Together

The two clauses aren't either/or. WHERE narrows down which rows participate in the grouping; HAVING narrows down which groups make it to the output. Most real queries use both.

Read it top-to-bottom in execution order:

  1. WHERE status = 'paid' — drop refunded rows entirely.
  2. GROUP BY customer — bucket what's left by customer.
  3. SUM(amount) runs per group.
  4. HAVING SUM(amount) > 75 — keep only groups that pass.

Boris (80 + 20 = 100) and Cleo (200) survive. Ada's only paid order was 50, which doesn't meet the threshold.

Multiple Conditions and Multiple Group Columns

HAVING accepts the same boolean operators as WHEREAND, OR, NOT — and you can group by more than one column to get sub-buckets:

Each (region, quarter) pair is a separate group. The HAVING clause requires both a total above 100 and at least two deals. Only ('North', 'Q1') and ('South', 'Q2') qualify.

A Practical Pattern: Finding Duplicates

A GROUP BY ... HAVING COUNT(*) > 1 query is the standard way to find duplicate values in a column:

Two duplicates surface. From here you'd typically decide whether to merge accounts, add a UNIQUE constraint, or clean up the data — but the discovery query is the same shape every time.

HAVING Without GROUP BY

This is unusual but legal. With no GROUP BY, the entire result set is treated as a single group, and HAVING filters it as a whole — you get either all the aggregated values or nothing:

The single result row appears because the sum is 160. Change the threshold to > 200 and the query returns no rows at all. In practice, you'll almost always pair HAVING with GROUP BY — but it's good to know the language doesn't require it.

Quick Recap

  • GROUP BY collapses rows into per-key buckets; aggregates run inside each bucket.
  • Every non-aggregated column in SELECT should appear in GROUP BY.
  • WHERE filters rows before grouping; HAVING filters groups after.
  • Aggregates like COUNT(*) and SUM(...) belong in HAVING, never WHERE.
  • HAVING accepts compound conditions and can reference SELECT aliases.

Next: Foreign Keys

Aggregating a single table is useful, but most real schemas spread data across multiple tables — orders here, customers there, products somewhere else. Foreign keys are how you wire those tables together so the relationships stay consistent. That's the next chapter.

Frequently Asked Questions

What is the difference between WHERE and HAVING in SQLite?

WHERE filters individual rows before they're grouped. HAVING filters groups after aggregation. So WHERE amount > 100 keeps only rows over 100, while HAVING SUM(amount) > 100 keeps only groups whose total is over 100. Aggregate functions like COUNT or SUM are not allowed in WHERE — that's what HAVING is for.

Can you use HAVING without GROUP BY in SQLite?

Yes. Without GROUP BY, SQLite treats the whole result set as a single group, and HAVING filters that one group as a unit. The query either returns one row or no rows. It's rare in practice — usually if you have a HAVING, you have a GROUP BY to go with it.

How do I filter groups by COUNT in SQLite?

Put the aggregate in HAVING, not WHERE. For example, SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id HAVING COUNT(*) > 1 returns customers with more than one order. You can also reference a column alias from the SELECT list inside HAVING in SQLite.

Coddy programming languages illustration

Learn to code with Coddy

GET STARTED