Menu

SQLite DISTINCT: Removing Duplicates from SELECT Queries

How SELECT DISTINCT works in SQLite — single columns, multiple columns, NULL handling, COUNT(DISTINCT), and when GROUP BY is the better tool.

This page includes runnable editors — edit, run, and see output instantly.

DISTINCT Removes Duplicate Rows

By default, SELECT returns every matching row, duplicates and all. DISTINCT tells SQLite to collapse rows that are identical across the columns you selected, so each unique combination shows up only once.

Five rows in, three rows out. SQLite looked at the customer column, threw out the repeats, and returned one row per unique value. The order isn't guaranteed — add ORDER BY if you care.

DISTINCT Applies to the Whole Select List

This trips people up. DISTINCT doesn't pick one column to deduplicate; it deduplicates whole rows based on every column you selected.

Each unique (customer, country) pair shows up once. If the same customer appeared with two different countries, you'd see both rows — they're not duplicates as far as SQLite is concerned.

There's no DISTINCT(customer) syntax that ignores the other columns. The parentheses look tempting, but SELECT DISTINCT(customer), country is parsed the same as SELECT DISTINCT customer, country — the parentheses just group an expression. If you really want one row per customer with some chosen country, that's a job for GROUP BY plus an aggregate.

COUNT(DISTINCT col)

A common need: how many unique values are in a column? COUNT(*) counts rows, COUNT(col) counts non-NULL values, and COUNT(DISTINCT col) counts unique non-NULL values.

Five orders, three unique customers, three unique countries. COUNT(DISTINCT ...) is the most useful aggregate form of DISTINCT — you'll reach for it any time you want to count "how many different things showed up."

Note that SQLite only allows a single column inside COUNT(DISTINCT ...). To count unique combinations of multiple columns, wrap them in a subquery: SELECT COUNT(*) FROM (SELECT DISTINCT a, b FROM t).

How DISTINCT Treats NULL

NULL has a quirky reputation in SQL because NULL = NULL evaluates to NULL, not TRUE. But DISTINCT makes a special exception: for deduplication, all NULLs are considered equal to each other.

Three rows come back: 'ada@example.com', 'dan@example.com', and a single NULL. The three NULL emails collapsed into one. The same rule applies to GROUP BY and to set operations like UNION — handy to remember when you're hunting down "why is that NULL row appearing once instead of three times?"

DISTINCT Runs Before ORDER BY and LIMIT

The clauses in a SELECT follow a logical order: FROMWHEREGROUP BYHAVINGSELECT/DISTINCTORDER BYLIMIT. So DISTINCT filters duplicates first, then ORDER BY sorts what's left, then LIMIT cuts it down.

WHERE keeps four rows, DISTINCT collapses Boris's duplicates, ORDER BY sorts alphabetically, LIMIT returns the first two. Worth tracing through once — confusion about result ordering usually comes from forgetting which step happens when.

DISTINCT vs GROUP BY

For pure deduplication, these two queries return the same rows:

Same result. The difference is what you can do next:

  • DISTINCT is for "give me unique rows" and nothing else.
  • GROUP BY is for "bucket rows and compute something per bucket" — COUNT(*), SUM(amount), MAX(created_at), and so on.

If you find yourself reaching for DISTINCT and then realizing you also want a per-customer total, that's the signal to switch to GROUP BY:

One row per customer, with the aggregates you wanted. DISTINCT couldn't do this — it has no way to express "one row per group plus a sum."

A Few Things to Watch For

  • Performance. DISTINCT usually requires SQLite to sort or hash the rows to find duplicates. On large result sets, an index on the deduplicated column(s) helps. If you're doing SELECT DISTINCT on every column of a wide table, ask whether you actually need every column.
  • DISTINCT * is rare. It's legal — SELECT DISTINCT * FROM t deduplicates whole rows — but if your table has a primary key, every row is already unique, so this does nothing useful.
  • Don't confuse with UNIQUE. UNIQUE is a constraint on a table that prevents duplicate values from being inserted in the first place. DISTINCT is a query-time filter that hides duplicates in the result. Different tools, different jobs.

Next: CASE Expressions

Once you can shape result rows with SELECT, WHERE, ORDER BY, and DISTINCT, the next step is conditional logic inside a query. CASE expressions let you return different values based on conditions — the SQL equivalent of an if/else ladder, and the next page covers them.

Frequently Asked Questions

How does SELECT DISTINCT work in SQLite?

SELECT DISTINCT removes duplicate rows from the result set. SQLite compares every column in the select list and keeps one row per unique combination. It applies after WHERE and JOIN but before ORDER BY and LIMIT.

Can I use DISTINCT on multiple columns in SQLite?

Yes — DISTINCT always applies to the entire select list, not a single column. SELECT DISTINCT city, country FROM users returns each unique (city, country) pair. There's no syntax for DISTINCT(city) that ignores the other columns; if you need that, use GROUP BY with an aggregate.

How does DISTINCT handle NULL values in SQLite?

DISTINCT treats NULL as equal to other NULLs for deduplication purposes — so multiple rows with NULL collapse into one. This is different from how = works in WHERE clauses, where NULL = NULL is unknown. It's a special rule just for DISTINCT, GROUP BY, and UNION.

What's the difference between DISTINCT and GROUP BY in SQLite?

For deduplication alone, SELECT DISTINCT col and SELECT col FROM t GROUP BY col produce identical results. The difference is intent: use DISTINCT when you just want unique rows, and GROUP BY when you also want to compute aggregates like COUNT(*) or SUM(amount) per group.

Coddy programming languages illustration

Learn to code with Coddy

GET STARTED