MySQL - SELECT all columns WHERE one column is DISTINCT

sql

distinct

group-by

subqueries

by Alex Kataev · Nov 27, 2024

To get one full row for each distinct value of a column, join the table to a derived table that picks a single row per value, for example the row with the smallest id:

SELECT t.*
FROM your_table t
INNER JOIN (
  SELECT MIN(id) AS id
  FROM your_table
  GROUP BY your_column
) AS first_rows ON t.id = first_rows.id;

In the above code, replace your_table, your_column, and id with your table name, the column that must be distinct, and the table's primary key. Joining on SELECT DISTINCT your_column alone would not help: every row matches its own value, so all rows, duplicates included, would come back. Joining on one key per group returns each distinct value exactly once, along with the other columns of that row.
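
To see the effect on concrete data, here is a minimal sketch. The table visits and its rows are hypothetical, made up purely for illustration; the query keeps the row with the smallest id for each link.

```sql
-- Hypothetical sample table; names and data are illustrative only.
CREATE TABLE visits (
  id    INT PRIMARY KEY,
  day   INT,
  month INT,
  link  VARCHAR(50)
);

INSERT INTO visits (id, day, month, link) VALUES
  (1, 3, 11, '/home'),
  (2, 4, 11, '/home'),
  (3, 4, 11, '/about'),
  (4, 9, 12, '/home');

-- One full row per distinct link: the row with the lowest id in each group.
SELECT v.*
FROM visits v
INNER JOIN (
  SELECT MIN(id) AS first_id
  FROM visits
  GROUP BY link
) AS firsts ON v.id = firsts.first_id
ORDER BY v.id;
-- Returns the rows with id 1 ('/home') and id 3 ('/about').
```

Swapping MIN(id) for MAX(id) would keep the most recently inserted row per link instead, assuming id grows with insertion order.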

Dissecting DISTINCT and GROUP BY

How GROUP BY outshines DISTINCT

A common mistake is reaching for DISTINCT when GROUP BY is what's actually needed. To get unique values in one column while still fetching the other columns, GROUP BY is often the better fit:

SELECT id, day, month, link
FROM your_table
GROUP BY link;

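This returns every unique link once, with an id, day, and month taken from its group. Two MySQL-specific caveats apply. The query is normally accepted only when the ONLY_FULL_GROUP_BY SQL mode is disabled (it is enabled by default since MySQL 5.7.5). And when it is accepted, the server is free to choose any value from the group for each non-grouped column, so which row's values you see is not deterministic.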
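
A sketch of two ways to write this query so that it also runs under ONLY_FULL_GROUP_BY, the SQL mode that rejects non-aggregated columns missing from GROUP BY. It reuses the your_table placeholder and the column names from the example above; the alias first_id is made up for illustration.

```sql
-- Option 1: state explicitly that any value from the group is acceptable.
-- ANY_VALUE() is available in MySQL 5.7 and later; the values returned for
-- id, day, and month are not guaranteed to come from the same row.
SELECT ANY_VALUE(id)    AS id,
       ANY_VALUE(day)   AS day,
       ANY_VALUE(month) AS month,
       link
FROM your_table
GROUP BY link;

-- Option 2: use a real aggregate when a specific value is wanted,
-- for example the smallest id per link.
SELECT MIN(id) AS first_id, link
FROM your_table
GROUP BY link;
```

Option 2 is the deterministic one: its result can be joined back to the table to fetch the complete row for each first_id.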

The magic of window functions: row_number()

To control which row represents each distinct value, use the ROW_NUMBER() window function (available in MySQL 8.0 and later):

SELECT * FROM (
  SELECT t.*, ROW_NUMBER() OVER (PARTITION BY your_column ORDER BY id) AS row_num
  FROM your_table t
) sub_query
WHERE sub_query.row_num = 1;

This partitions the rows by your_column, orders each partition by id, and keeps the first row of every partition. Clever, no?
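
If the newest row per value is wanted rather than the oldest, the same pattern works with a descending sort. A sketch, assuming MySQL 8.0 or later (for window functions and common table expressions) and assuming that a larger id means a newer row; your_column is a placeholder for the column that must be distinct, and ranked is an illustrative name.

```sql
-- Number the rows of each partition from newest to oldest,
-- then keep only the newest one.
WITH ranked AS (
  SELECT t.*,
         ROW_NUMBER() OVER (PARTITION BY your_column ORDER BY id DESC) AS row_num
  FROM your_table t
)
SELECT *
FROM ranked
WHERE row_num = 1;
```

The result still carries the helper column row_num; list the wanted columns explicitly in the outer SELECT to leave it out.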

Row filtering through subqueries

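ROW_NUMBER() cannot be filtered in the WHERE clause of the same query that computes it, so wrap it in a subquery and filter on the outside:

SELECT * FROM (
  SELECT t.*, ROW_NUMBER() OVER (PARTITION BY your_column) AS row_seq
  FROM your_table t
) AS sub_alias
WHERE row_seq = 1;

Here, the filter row_seq = 1 ensures every unique your_column value comes with one full row of data. Because this window has no ORDER BY, which row of each partition is numbered 1 is arbitrary; add an ORDER BY inside OVER (...) when the choice matters.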

Be aware of the nuances

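MySQL's non-standard extensions to GROUP BY can cause unexpected results, for example when the select list contains "hidden" columns that are neither aggregated nor named in GROUP BY. Check the MySQL documentation on GROUP BY handling (or the related Stack Overflow questions) to understand these nuances before relying on them.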
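
Whether MySQL tolerates such hidden columns depends on the session's SQL mode. A minimal sketch for inspecting the mode and switching on the strict behavior for the current session only:

```sql
-- Show the SQL modes active for the current session.
SELECT @@SESSION.sql_mode;

-- Enable ONLY_FULL_GROUP_BY for this session, keeping the other modes.
SET SESSION sql_mode = CONCAT(@@SESSION.sql_mode, ',ONLY_FULL_GROUP_BY');
```

Testing queries with the strict mode enabled is a cheap way to find GROUP BY statements whose results depend on arbitrary row choices.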

Alternative paths: exploring subqueries and IN clause

With a subquery and the IN clause, you can select one row per distinct value, provided the subquery's result isn't overwhelmingly large:

SELECT *
FROM your_table
WHERE id IN (
  SELECT MIN(id)
  FROM your_table
  GROUP BY your_column
);

The above snippet ensures each distinct your_column value is selected once by collecting the minimum id of every group and returning only those rows. It's like picking the smallest fruit from every tree (we are coders, aren't we, so no physical work required).
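
How large is too large depends heavily on indexing. A sketch of one thing to try, where idx_your_column_id is a hypothetical index name and your_column is a placeholder for the column that must be distinct:

```sql
-- Composite index that may let MySQL find MIN(id) per your_column
-- from the index instead of scanning the whole table.
CREATE INDEX idx_your_column_id ON your_table (your_column, id);

-- Inspect the plan to check whether the index is actually used.
EXPLAIN
SELECT *
FROM your_table
WHERE id IN (
  SELECT MIN(id)
  FROM your_table
  GROUP BY your_column
);
```

Whether the optimizer picks the index depends on the MySQL version and the data, so treat the EXPLAIN output as the deciding evidence rather than the index definition.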

Keys for effective querying

Trim the fat: select only the columns you need to reduce the amount of data transferred.
Always use the primary key in subqueries so each row is identified unambiguously.
If you're using GROUP BY without an explicit ordering, expect an arbitrary row's values for every distinct value.
When choosing between DISTINCT and GROUP BY, GROUP BY is the friendlier option when you need data from other columns alongside a distinct target column.

Visualization (now with bonus emojis!)

To better understand, let's say you're at a fruit market. The basket holds several fruits, and you were told to pick one of each distinct type:

Fruit Baskets (🍎🍎🍏🍊🍊🍊): Select DISTINCT types

You end up with:

Your Bag (🍎🍏🍊): Unique fruits, no duplicates!

SQL's DISTINCT clause works the same way:

SELECT DISTINCT column_name FROM table_name;

Just like you selected unique fruits, this SQL statement returns unique values from column_name.
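
One caveat: DISTINCT applies to the whole select list, not to a single column. A short sketch, where other_column is a hypothetical second column added for illustration:

```sql
-- Returns each distinct (column_name, other_column) pair, so a given
-- column_name value can still appear several times in the result.
SELECT DISTINCT column_name, other_column
FROM table_name;
```

This is why adding more columns to a DISTINCT query does not produce one row per value of the first column.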
