Runnable SQLite Docs: DISTINCT

Q: How does SELECT DISTINCT work in SQLite?

SELECT DISTINCT removes duplicate rows from the result set. SQLite compares every column in the select list and keeps one row per unique combination. It applies after WHERE and JOIN but before ORDER BY and LIMIT.

Q: Can I use DISTINCT on multiple columns in SQLite?

Yes - DISTINCT always applies to the entire select list, not a single column. SELECT DISTINCT city, country FROM users returns each unique (city, country) pair. There's no syntax for DISTINCT(city) that ignores the other columns; if you need that, use GROUP BY with an aggregate.

Q: How does DISTINCT handle NULL values in SQLite?

DISTINCT treats NULL as equal to other NULLs for deduplication purposes - so multiple rows with NULL collapse into one. This is different from how = works in WHERE clauses, where NULL = NULL is unknown. It's a special rule just for DISTINCT, GROUP BY, and UNION.

Q: What's the difference between DISTINCT and GROUP BY in SQLite?

For deduplication alone, SELECT DISTINCT col and SELECT col FROM t GROUP BY col produce identical results. The difference is intent: use DISTINCT when you just want unique rows, and GROUP BY when you also want to compute aggregates like COUNT(*) or SUM(amount) per group.

DISTINCT Removes Duplicate Rows

By default, SELECT returns every matching row, duplicates and all. DISTINCT tells SQLite to collapse rows that are identical across the columns you selected, so each unique combination shows up only once.

CREATE TABLE orders (id INTEGER, customer TEXT, country TEXT);
INSERT INTO orders VALUES
    (1, 'Ada',   'UK'),
    (2, 'Boris', 'FR'),
    (3, 'Ada',   'UK'),
    (4, 'Cleo',  'US'),
    (5, 'Boris', 'FR');

SELECT DISTINCT customer FROM orders;

Five rows in, three rows out. SQLite looked at the customer column, threw out the repeats, and returned one row per unique value. The order isn't guaranteed - add ORDER BY if you care.

DISTINCT Applies to the Whole Select List

This trips people up. DISTINCT doesn't pick one column to deduplicate; it deduplicates whole rows based on every column you selected.

CREATE TABLE orders (id INTEGER, customer TEXT, country TEXT);
INSERT INTO orders VALUES
    (1, 'Ada',   'UK'),
    (2, 'Boris', 'FR'),
    (3, 'Ada',   'UK'),
    (4, 'Cleo',  'US'),
    (5, 'Boris', 'FR');

SELECT DISTINCT customer, country FROM orders;

Each unique (customer, country) pair shows up once. If the same customer appeared with two different countries, you'd see both rows - they're not duplicates as far as SQLite is concerned.

There's no DISTINCT(customer) syntax that ignores the other columns. The parentheses look tempting, but SELECT DISTINCT(customer), country is parsed the same as SELECT DISTINCT customer, country - the parentheses just group an expression. If you really want one row per customer with some chosen country, that's a job for GROUP BY plus an aggregate.

COUNT(DISTINCT col)

A common need: how many unique values are in a column? COUNT(*) counts rows, COUNT(col) counts non-NULL values, and COUNT(DISTINCT col) counts unique non-NULL values.

CREATE TABLE orders (id INTEGER, customer TEXT, country TEXT);
INSERT INTO orders VALUES
    (1, 'Ada',   'UK'),
    (2, 'Boris', 'FR'),
    (3, 'Ada',   'UK'),
    (4, 'Cleo',  'US'),
    (5, 'Boris', 'FR');

SELECT
    COUNT(*)                  AS total_orders,
    COUNT(DISTINCT customer)  AS unique_customers,
    COUNT(DISTINCT country)   AS unique_countries
FROM orders;

Five orders, three unique customers, three unique countries. COUNT(DISTINCT ...) is the most useful aggregate form of DISTINCT - you'll reach for it any time you want to count "how many different things showed up."

Note that SQLite only allows a single column inside COUNT(DISTINCT ...). To count unique combinations of multiple columns, wrap them in a subquery: SELECT COUNT(*) FROM (SELECT DISTINCT a, b FROM t).

How DISTINCT Treats NULL

NULL has a quirky reputation in SQL because NULL = NULL evaluates to NULL, not TRUE. But DISTINCT makes a special exception: for deduplication, all NULLs are considered equal to each other.

CREATE TABLE contacts (name TEXT, email TEXT);
INSERT INTO contacts VALUES
    ('Ada',   'ada@example.com'),
    ('Boris', NULL),
    ('Cleo',  NULL),
    ('Dan',   'dan@example.com'),
    ('Eve',   NULL);

SELECT DISTINCT email FROM contacts;

Three rows come back: 'ada@example.com', 'dan@example.com', and a single NULL. The three NULL emails collapsed into one. The same rule applies to GROUP BY and to set operations like UNION - handy to remember when you're hunting down "why is that NULL row appearing once instead of three times?"

DISTINCT Runs Before ORDER BY and LIMIT

The clauses in a SELECT follow a logical order: FROM → WHERE → GROUP BY → HAVING → SELECT/DISTINCT → ORDER BY → LIMIT. So DISTINCT filters duplicates first, then ORDER BY sorts what's left, then LIMIT cuts it down.

CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
INSERT INTO orders VALUES
    (1, 'Ada',   12.0),
    (2, 'Boris', 30.0),
    (3, 'Ada',   45.0),
    (4, 'Cleo',  20.0),
    (5, 'Boris', 30.0);

SELECT DISTINCT customer
FROM orders
WHERE amount > 15
ORDER BY customer
LIMIT 2;

WHERE keeps four rows, DISTINCT collapses Boris's duplicates, ORDER BY sorts alphabetically, LIMIT returns the first two. Worth tracing through once - confusion about result ordering usually comes from forgetting which step happens when.

DISTINCT vs GROUP BY

For pure deduplication, these two queries return the same rows:

CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
INSERT INTO orders VALUES
    (1, 'Ada',   12.0),
    (2, 'Boris', 30.0),
    (3, 'Ada',   45.0),
    (4, 'Cleo',  20.0);

SELECT DISTINCT customer FROM orders;
SELECT customer FROM orders GROUP BY customer;

Same result. The difference is what you can do next:

DISTINCT is for "give me unique rows" and nothing else.
GROUP BY is for "bucket rows and compute something per bucket" - COUNT(*), SUM(amount), MAX(created_at), and so on.

If you find yourself reaching for DISTINCT and then realizing you also want a per-customer total, that's the signal to switch to GROUP BY:

CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL);
INSERT INTO orders VALUES
    (1, 'Ada',   12.0),
    (2, 'Boris', 30.0),
    (3, 'Ada',   45.0),
    (4, 'Cleo',  20.0);

SELECT customer, COUNT(*) AS orders, SUM(amount) AS total
FROM orders
GROUP BY customer
ORDER BY total DESC;

One row per customer, with the aggregates you wanted. DISTINCT couldn't do this - it has no way to express "one row per group plus a sum."

A Few Things to Watch For

Performance. DISTINCT usually requires SQLite to sort or hash the rows to find duplicates. On large result sets, an index on the deduplicated column(s) helps. If you're doing SELECT DISTINCT on every column of a wide table, ask whether you actually need every column.
DISTINCT * is rare. It's legal - SELECT DISTINCT * FROM t deduplicates whole rows - but if your table has a primary key, every row is already unique, so this does nothing useful.
Don't confuse with UNIQUE. UNIQUE is a constraint on a table that prevents duplicate values from being inserted in the first place. DISTINCT is a query-time filter that hides duplicates in the result. Different tools, different jobs.

Next: CASE Expressions

Once you can shape result rows with SELECT, WHERE, ORDER BY, and DISTINCT, the next step is conditional logic inside a query. CASE expressions let you return different values based on conditions - the SQL equivalent of an if/else ladder, and the next page covers them.

Frequently Asked Questions

How does SELECT DISTINCT work in SQLite?

SELECT DISTINCT removes duplicate rows from the result set. SQLite compares every column in the select list and keeps one row per unique combination. It applies after WHERE and JOIN but before ORDER BY and LIMIT.

Can I use DISTINCT on multiple columns in SQLite?

Yes - DISTINCT always applies to the entire select list, not a single column. SELECT DISTINCT city, country FROM users returns each unique (city, country) pair. There's no syntax for DISTINCT(city) that ignores the other columns; if you need that, use GROUP BY with an aggregate.

How does DISTINCT handle NULL values in SQLite?

DISTINCT treats NULL as equal to other NULLs for deduplication purposes - so multiple rows with NULL collapse into one. This is different from how = works in WHERE clauses, where NULL = NULL is unknown. It's a special rule just for DISTINCT, GROUP BY, and UNION.

What's the difference between DISTINCT and GROUP BY in SQLite?

For deduplication alone, SELECT DISTINCT col and SELECT col FROM t GROUP BY col produce identical results. The difference is intent: use DISTINCT when you just want unique rows, and GROUP BY when you also want to compute aggregates like COUNT(*) or SUM(amount) per group.

Related concepts

SQLite DISTINCT: Removing Duplicates from SELECT Queries