Runnable SQLite Docs: GROUP BY & HAVING

GROUP BY Collapses Rows Into Buckets

Aggregate functions like COUNT, SUM, and AVG reduce many rows to one number. GROUP BY lets you do that per category - one number per customer, per month, per status. Each unique value (or combination of values) becomes a single row in the result.

CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL
);

INSERT INTO orders (customer, amount) VALUES
    ('Ada',   50.00),
    ('Ada',   30.00),
    ('Boris', 80.00),
    ('Boris', 20.00),
    ('Boris', 15.00),
    ('Cleo', 200.00);

SELECT customer, COUNT(*) AS order_count, SUM(amount) AS total
FROM orders
GROUP BY customer;

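Three customers, three rows out: Ada (2 orders, 80 total), Boris (3 orders, 115 total), and Cleo (1 order, 200 total). The six original rows no longer appear individually in the result - they've been collapsed into per-customer buckets, with COUNT(*) and SUM(amount) calculated within each one.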

The mental model: GROUP BY customer says "treat all rows with the same customer as one group." Aggregates then operate on each group separately.
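
The same bucketing applies to any aggregate, not just COUNT and SUM. The sketch below reuses the orders data from above and adds AVG, MIN, and MAX per customer. The aliases (avg_amount, smallest, largest), the ROUND call, and the ORDER BY are illustrative additions, and like the other snippets here it assumes a fresh database.

```sql
-- Same sample data as above.
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL
);

INSERT INTO orders (customer, amount) VALUES
    ('Ada',   50.00),
    ('Ada',   30.00),
    ('Boris', 80.00),
    ('Boris', 20.00),
    ('Boris', 15.00),
    ('Cleo', 200.00);

-- One row per customer; each aggregate sees only that customer's rows.
SELECT customer,
       ROUND(AVG(amount), 2) AS avg_amount,
       MIN(amount) AS smallest,
       MAX(amount) AS largest
FROM orders
GROUP BY customer
ORDER BY customer;

-- Expected rows:
--   Ada   | 40.0  | 30.0  | 50.0
--   Boris | 38.33 | 15.0  | 80.0
--   Cleo  | 200.0 | 200.0 | 200.0
```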

What You Can Put in the SELECT List

This trips people up. When you use GROUP BY, every column in the SELECT list should either be in the GROUP BY clause or be inside an aggregate function. Otherwise the value is ambiguous - which row from the group should it come from?

CREATE TABLE sales (
    region TEXT,
    rep TEXT,
    amount REAL
);

INSERT INTO sales VALUES
    ('North', 'Ada', 100),
    ('North', 'Boris', 200),
    ('South', 'Cleo', 150);

-- This works: region is grouped, amount is aggregated.
SELECT region, SUM(amount) AS total
FROM sales
GROUP BY region;

If you wrote SELECT region, rep, SUM(amount) with GROUP BY region, SQLite would happily run it (it's lenient where other databases reject this), but rep would be picked arbitrarily from the group. You'd get one rep name per region with no guarantee which one. Don't rely on that - group by every non-aggregated column you display.
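
There are two common ways to make the rep column well defined, sketched here against the same sales data: group by it as well, or fold it into an aggregate. group_concat is SQLite's built-in string aggregate; the reps alias and the ORDER BY clauses are illustrative additions.

```sql
CREATE TABLE sales (
    region TEXT,
    rep TEXT,
    amount REAL
);

INSERT INTO sales VALUES
    ('North', 'Ada', 100),
    ('North', 'Boris', 200),
    ('South', 'Cleo', 150);

-- Option 1: group by every non-aggregated column you display.
-- One row per (region, rep) pair.
SELECT region, rep, SUM(amount) AS total
FROM sales
GROUP BY region, rep
ORDER BY region, rep;
-- Expected rows:
--   North | Ada   | 100.0
--   North | Boris | 200.0
--   South | Cleo  | 150.0

-- Option 2: keep one row per region and aggregate the rep names too.
-- The order of names inside each list is not guaranteed.
SELECT region, group_concat(rep, ', ') AS reps, SUM(amount) AS total
FROM sales
GROUP BY region
ORDER BY region;
```

Option 1 changes the shape of the result (more rows), while option 2 keeps one row per region. Which one fits depends on what the report is meant to show.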

HAVING Filters Groups After Aggregation

WHERE filters rows before grouping. HAVING filters groups after grouping. That's the whole distinction, and it's why you can't put COUNT(*) > 1 in a WHERE clause - at the time WHERE runs, the count doesn't exist yet.

CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL
);

INSERT INTO orders (customer, amount) VALUES
    ('Ada', 50), ('Ada', 30),
    ('Boris', 80), ('Boris', 20), ('Boris', 15),
    ('Cleo', 200);

SELECT customer, COUNT(*) AS order_count
FROM orders
GROUP BY customer
HAVING COUNT(*) > 1;

Cleo placed only one order, so her group is filtered out. Ada and Boris remain. The condition runs against each group's aggregated value, not against individual rows.
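
To see why the condition has to live in HAVING, it can help to spell the same query out as a subquery: the inner query builds the groups, and only then can an outer WHERE see order_count. This is an equivalent rewrite for illustration rather than a recommended style, and per_customer is a made-up name for the subquery. The commented-out statement shows the form SQLite rejects with an error about misusing an aggregate.

```sql
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL
);

INSERT INTO orders (customer, amount) VALUES
    ('Ada', 50), ('Ada', 30),
    ('Boris', 80), ('Boris', 20), ('Boris', 15),
    ('Cleo', 200);

-- Rejected: WHERE runs per row, before any group (or count) exists.
-- SELECT customer, COUNT(*) FROM orders WHERE COUNT(*) > 1 GROUP BY customer;

-- Same result as the HAVING query: group first, then filter the grouped rows.
SELECT customer, order_count
FROM (
    SELECT customer, COUNT(*) AS order_count
    FROM orders
    GROUP BY customer
) AS per_customer
WHERE order_count > 1
ORDER BY customer;
-- Expected rows:
--   Ada   | 2
--   Boris | 3
```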

You can reference column aliases from the SELECT list directly in HAVING - SQLite allows it:

CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL
);

INSERT INTO orders (customer, amount) VALUES
    ('Ada',   50.00),
    ('Ada',   30.00),
    ('Boris', 80.00),
    ('Boris', 20.00),
    ('Boris', 15.00),
    ('Cleo', 200.00);

SELECT customer, SUM(amount) AS total
FROM orders
GROUP BY customer
HAVING total >= 100;

That's often more readable than repeating SUM(amount) in the HAVING clause.
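
Alias references in HAVING are a convenience that not every database accepts. If the query may need to run somewhere other than SQLite, a more portable sketch repeats the aggregate expression instead; on the sample data both forms should return the same rows. The ORDER BY is an illustrative addition to make the output order predictable.

```sql
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL
);

INSERT INTO orders (customer, amount) VALUES
    ('Ada',   50.00),
    ('Ada',   30.00),
    ('Boris', 80.00),
    ('Boris', 20.00),
    ('Boris', 15.00),
    ('Cleo', 200.00);

-- Portable form: HAVING repeats SUM(amount) rather than using the alias.
SELECT customer, SUM(amount) AS total
FROM orders
GROUP BY customer
HAVING SUM(amount) >= 100
ORDER BY customer;
-- Expected rows (Ada's total of 80 falls below the threshold):
--   Boris | 115.0
--   Cleo  | 200.0
```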

WHERE vs HAVING: Use Both Together

The two clauses aren't either/or. WHERE narrows down which rows participate in the grouping; HAVING narrows down which groups make it to the output. Many real queries use both.

CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL,
    status TEXT
);

INSERT INTO orders (customer, amount, status) VALUES
    ('Ada', 50, 'paid'),
    ('Ada', 30, 'refunded'),
    ('Boris', 80, 'paid'),
    ('Boris', 20, 'paid'),
    ('Cleo', 200, 'paid'),
    ('Cleo', 50, 'refunded');

SELECT customer, SUM(amount) AS paid_total
FROM orders
WHERE status = 'paid'
GROUP BY customer
HAVING SUM(amount) > 75;

Read it top-to-bottom in execution order:

1. WHERE status = 'paid' - drop refunded rows entirely.
2. GROUP BY customer - bucket what's left by customer.
3. SUM(amount) runs per group.
4. HAVING SUM(amount) > 75 - keep only groups that pass.

Boris (80 + 20 = 100) and Cleo (200) survive. Ada's only paid order was 50, which doesn't meet the threshold.
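
One way to check that reading is to peel the query apart and look at each stage on its own. The sketch below uses the same table and data; the two intermediate queries exist only for inspection, and the ORDER BY clauses are added to make the output order predictable.

```sql
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL,
    status TEXT
);

INSERT INTO orders (customer, amount, status) VALUES
    ('Ada', 50, 'paid'),
    ('Ada', 30, 'refunded'),
    ('Boris', 80, 'paid'),
    ('Boris', 20, 'paid'),
    ('Cleo', 200, 'paid'),
    ('Cleo', 50, 'refunded');

-- Stage 1: rows that survive WHERE (4 of the 6).
SELECT customer, amount
FROM orders
WHERE status = 'paid'
ORDER BY id;
--   Ada   | 50.0
--   Boris | 80.0
--   Boris | 20.0
--   Cleo  | 200.0

-- Stage 2: the groups and their sums, before HAVING is applied.
SELECT customer, SUM(amount) AS paid_total
FROM orders
WHERE status = 'paid'
GROUP BY customer
ORDER BY customer;
--   Ada   | 50.0
--   Boris | 100.0
--   Cleo  | 200.0

-- Stage 3: HAVING drops the groups at or below 75.
SELECT customer, SUM(amount) AS paid_total
FROM orders
WHERE status = 'paid'
GROUP BY customer
HAVING SUM(amount) > 75
ORDER BY customer;
--   Boris | 100.0
--   Cleo  | 200.0
```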

Multiple Conditions and Multiple Group Columns

HAVING accepts the same boolean operators as WHERE - AND, OR, NOT - and you can group by more than one column to get sub-buckets:

CREATE TABLE sales (
    region TEXT,
    quarter TEXT,
    amount REAL
);

INSERT INTO sales VALUES
    ('North', 'Q1', 100), ('North', 'Q1', 50),
    ('North', 'Q2', 300),
    ('South', 'Q1', 80),
    ('South', 'Q2', 120), ('South', 'Q2', 60);

SELECT region, quarter, SUM(amount) AS total, COUNT(*) AS deals
FROM sales
GROUP BY region, quarter
HAVING SUM(amount) > 100 AND COUNT(*) >= 2;

Each (region, quarter) pair is a separate group. The HAVING clause requires both a total above 100 and at least two deals. Only ('North', 'Q1') and ('South', 'Q2') qualify.
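
Swapping AND for OR loosens the filter: a group then passes if either condition holds. A small sketch on the same data, with an ORDER BY added so the output order is predictable:

```sql
CREATE TABLE sales (
    region TEXT,
    quarter TEXT,
    amount REAL
);

INSERT INTO sales VALUES
    ('North', 'Q1', 100), ('North', 'Q1', 50),
    ('North', 'Q2', 300),
    ('South', 'Q1', 80),
    ('South', 'Q2', 120), ('South', 'Q2', 60);

-- Keep groups with a total above 100 OR at least two deals.
SELECT region, quarter, SUM(amount) AS total, COUNT(*) AS deals
FROM sales
GROUP BY region, quarter
HAVING SUM(amount) > 100 OR COUNT(*) >= 2
ORDER BY region, quarter;
-- Expected rows:
--   North | Q1 | 150.0 | 2
--   North | Q2 | 300.0 | 1   (one deal, but the total clears 100)
--   South | Q2 | 180.0 | 2
-- ('South', 'Q1') fails both tests: total 80, one deal.
```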

A Practical Pattern: Finding Duplicates

A GROUP BY ... HAVING COUNT(*) > 1 query is the standard way to find duplicate values in a column:

CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT
);

INSERT INTO users (email) VALUES
    ('ada@example.com'),
    ('boris@example.com'),
    ('ada@example.com'),
    ('cleo@example.com'),
    ('boris@example.com');

SELECT email, COUNT(*) AS occurrences
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

Two duplicated emails surface: ada@example.com and boris@example.com, each with two occurrences. From here you'd typically decide whether to merge accounts, add a UNIQUE constraint, or clean up the data - but the discovery query is the same shape every time.
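
The grouped query reports which values repeat, but not which rows carry them. A common follow-up, sketched here under the same schema, feeds the grouped result into an IN subquery to list the individual rows and their ids:

```sql
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT
);

INSERT INTO users (email) VALUES
    ('ada@example.com'),
    ('boris@example.com'),
    ('ada@example.com'),
    ('cleo@example.com'),
    ('boris@example.com');

-- List every row whose email appears more than once.
SELECT id, email
FROM users
WHERE email IN (
    SELECT email
    FROM users
    GROUP BY email
    HAVING COUNT(*) > 1
)
ORDER BY email, id;
-- Expected rows:
--   1 | ada@example.com
--   3 | ada@example.com
--   2 | boris@example.com
--   5 | boris@example.com
```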

HAVING Without GROUP BY

This is unusual but legal. With no GROUP BY, the entire result set is treated as a single group, and HAVING filters it as a whole - you get either all the aggregated values or nothing:

CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
INSERT INTO orders (amount) VALUES (50), (30), (80);

SELECT COUNT(*) AS total_orders, SUM(amount) AS revenue
FROM orders
HAVING SUM(amount) > 100;

The single result row appears because the sum is 160. Change the threshold to > 200 and the query returns no rows at all. In practice, you'll almost always pair HAVING with GROUP BY - but it's good to know the language doesn't require it.
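
One caveat: going by the SQLite release notes, HAVING without GROUP BY is accepted from version 3.39.0 onward, and older builds reject the query. If you're unsure which version you're running, a subquery gives the same all-or-nothing behavior. The sketch below shows that workaround; totals is a made-up name for the subquery.

```sql
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
INSERT INTO orders (amount) VALUES (50), (30), (80);

-- Check which SQLite version is running.
SELECT sqlite_version();

-- Aggregate first, then filter the single aggregated row with WHERE.
SELECT total_orders, revenue
FROM (
    SELECT COUNT(*) AS total_orders, SUM(amount) AS revenue
    FROM orders
) AS totals
WHERE revenue > 100;
-- Expected row:
--   3 | 160.0
-- With WHERE revenue > 200 the query returns no rows.
```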

Quick Recap

GROUP BY collapses rows into per-key buckets; aggregates run inside each bucket.
Every non-aggregated column in SELECT should appear in GROUP BY.
WHERE filters rows before grouping; HAVING filters groups after.
Conditions on aggregates like COUNT(*) and SUM(...) belong in HAVING, never WHERE.
HAVING accepts compound conditions and can reference SELECT aliases.

Next: Foreign Keys

Aggregating a single table is useful, but most real schemas spread data across multiple tables - orders here, customers there, products somewhere else. Foreign keys are how you wire those tables together so the relationships stay consistent. That's the next chapter.

Frequently Asked Questions

What is the difference between WHERE and HAVING in SQLite?

WHERE filters individual rows before they're grouped. HAVING filters groups after aggregation. So WHERE amount > 100 keeps only rows over 100, while HAVING SUM(amount) > 100 keeps only groups whose total is over 100. Aggregate functions like COUNT or SUM are not allowed in WHERE - that's what HAVING is for.

Can you use HAVING without GROUP BY in SQLite?

Yes. Without GROUP BY, SQLite treats the whole result set as a single group, and HAVING filters that one group as a unit. The query either returns one row or no rows. It's rare in practice - usually if you have a HAVING, you have a GROUP BY to go with it.

How do I filter groups by COUNT in SQLite?

Put the aggregate in HAVING, not WHERE. For example, SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id HAVING COUNT(*) > 1 returns customers with more than one order. You can also reference a column alias from the SELECT list inside HAVING in SQLite.
