How to select unique records by SQL
Get the unique rows in SQL by using the DISTINCT
keyword inside your SELECT
query. For unique values in a single column, such as email
, use:
To retrieve unique rows combined across several columns, such as first_name
and last_name
, use:
These queries avoid duplicate values, with the second one treating uniqueness as a composite of the two columns.
Getting the hang of DISTINCT
Utilizing DISTINCT
is a piece of 🍰 for basic cases. Here's how to handle joins and multiple columns, because we don't want those pesky duplicates:
This query guarantees unique row combinations when joining tables, keeping the data clean and crisp.
Meet the GROUP BY command
When working with aggregate functions or large datasets, GROUP BY
walks in like a boss:
This query returns the population of each city in the customers
table. GROUP BY
clubs rows with the same column value into a single group.
ROW_NUMBER(): the secret sauce
Need spicier uniqueness? Window functions coupled with ROW_NUMBER()
within a Common Table Expression (CTE) can save the day:
This beast of a query tags customers based on their signup date, then selects those rocking the most recent unique emails.
Mastering SQL's dialect
Every SQL dialect has unique twists. The DISTINCT ON
operation for PostgreSQL returns unique events based on location:
When in doubt, always validate your approach against the SQL syntax of the database management system (DBMS) you're using.
Addressing performance speed bumps
When handling large datasets, size does matter. Compare the efficiency of DISTINCT
against GROUP BY
or subqueries:
Although a column with an index will generally make DISTINCT
run faster, testing is key when deciding between using DISTINCT
, GROUP BY
, or subqueries.
Existing in the real world
In the real world, concrete use-cases guide your choice of columns in DISTINCT
:
In a scenario where different customers share the same address, choosing addresses indiscriminately with DISTINCT
could bind them. Examine your data model with a fine-tooth comb.
Juggling multiple columns
When handling compound keys—combinations of columns that should be unique together—DISTINCT
is still your friend:
This query gives you unique city records for each country.
Was this article helpful?