Explain Codes LogoExplain Codes Logo

Sql query to select distinct row with minimum value

sql
window-functions
sql-queries
performance-optimization
Anton ShumikhinbyAnton Shumikhin·Dec 3, 2024
TLDR

Getting results fast? You can use MIN() within a GROUP BY clause, and then join that with your original table. Let's break this down with an example. Consider a table orders, we want to find the order that has the lowest price for each product:

SELECT o1.product_id, o1.price FROM orders o1 JOIN ( SELECT product_id, MIN(price) AS min_price FROM orders GROUP BY product_id ) o2 ON o1.product_id = o2.product_id AND o1.price = o2.min_price;

Here's a quick translation for SQL beginners: "We are picking the product that has the lowest price from a list of offers."

A deep dive into window functions

Casting the magic with ROW_NUMBER()

In the world of SQL, ROW_NUMBER() is like your magic wand. Paired with OVER clause and a PARTITION BY, you've got yourself a powerful spell:

SELECT product_id, price FROM ( SELECT product_id, price, ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY price ASC) AS rn -- We ask SQL fairy to number each row within the product partition. FROM orders ) ranked WHERE rn = 1; -- Leaving only the Chosen Ones (rn=1) to survive the query.

DISTINCT ON to the rescue (PostgreSQL)

Riding a PostgreSQL broom? Try DISTINCT ON. A quick and clean way to get a distinct result. Just be careful of wind (compatibility) changes!

SELECT DISTINCT ON (product_id) product_id, price FROM orders ORDER BY product_id, price ASC;

Good ol' subqueries, in style

When in doubt, bring out the good ol' subqueries. This method is like your garden gnome, nuggety but reliable, present in most SQL implementations:

SELECT * FROM orders o1 WHERE price = ( SELECT MIN(price) FROM orders o2 WHERE o1.product_id = o2.product_id );

Handling multiple groupings like a champ

Enforcing group by on multiple criteria

When the battle arena is formed not by one, but by two or more columns like id and game, create a plan of action with a combination of grouping and precise strikes:

SELECT o1.id, o1.game, o1.point FROM scores o1 JOIN ( SELECT id, game, MIN(point) AS min_point FROM scores GROUP BY id, game -- Grouping two columns like... gulp... peanut butter and jelly. ) o2 ON o1.id = o2.id AND o1.game = o2.game AND o1.point = o2.min_point;

FIRST_VALUE: 'cause first impressions count

Sometimes we care about the first occurrence within a group. Turns out SQL is just like us, giving importance to first impressions:

SELECT DISTINCT id, game, FIRST_VALUE(point) OVER (PARTITION BY id, game ORDER BY point ASC) AS min_point -- It's like SQL is playing "First Point Wins". FROM scores;

Troubleshooting common issues

Keep an eye on the clock

Remember, execution time ⏰ matters. Especially, with larger datasets, JOINs and SUBQUERYs can be slower.

SQL is a bit dialectic, learn to translate

Check 😎 the compatibility of your SQL queries across different SQL dialects.

A tie is like a cliffhanger, handle with care

When you encounter a tie situation 🤼, use multiple columns in ORDER BY for a clear resolution.