Postgresql DISTINCT ON with different ORDER BY

sql

distinct-on

postgresql

sql-performance

byAnton Shumikhin·Jan 16, 2025

In PostgreSQL, you can leverage a subquery to retrieve distinct rows filtered by DISTINCT ON, and then apply a different sort order in the outer query.

Quick SQL Example:

SELECT * FROM (
  SELECT DISTINCT ON (column1) column1, column2
  FROM your_table
  ORDER BY column1, column2 DESC
) sub
ORDER BY column2;

Key Concepts:

DISTINCT ON (column1) filters out unique column1 values.
The inner ORDER BY column1, column2 DESC determines which row gets selected among each group of column1.
The final sort order is determined by the outer ORDER BY column2.

Enough tea? Let's get to the biscuits. 🍪

Optimizing value selection with DISTINCT ON

DISTINCT ON can serve as a powerful tool when your goal is to retrieve the maximum or minimum values within each distinct group. Order the rows of interest within the DISTINCT ON clause, and your job is done!

Getting most recent orders from a list, anyone? 👀

SELECT * FROM (
  SELECT DISTINCT ON (customer_id) customer_id, order_id, order_date
  FROM orders
  ORDER BY customer_id, order_date DESC
) sub
ORDER BY order_date DESC;

Handling complex ordering needs with subqueries

When your order by columns aren't in the DISTINCT ON clause, subqueries are your rescuers. These help you achieve a different order than you used for filtering rows.

Watch this in action:

SELECT * FROM (
  SELECT DISTINCT ON (address_id) address_id, product_id, purchase_date
  FROM purchases
  ORDER BY address_id, purchase_date DESC
) sub
ORDER BY product_id, purchase_date DESC;

Window functions: a savory alternative to DISTINCT ON

DISTINCT ON can be a real star. But it's not always enough. That's when PostgreSQL window functions like ROW_NUMBER(), RANK(), or LAST_VALUE() can step in. These are helpful in defining a window of related rows.

Using window functions to get distinct values:

SELECT column1, column2
FROM (
  SELECT column1, column2,
         ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2 DESC) as rn
  FROM your_table
) sub
WHERE rn = 1
ORDER BY column2;

We've ranked rows within partitions of column1 and picked top-ranked rows from each partition. It's all about being number 1, isn't it?

Subqueries and the performance game

While subqueries are flexible, don't forget they can impact performance. Don't you want optimized query execution? Match, set, go with indexed columns!

Let's look an example:

SELECT * FROM (
  SELECT DISTINCT ON (indexed_column) indexed_column, other_column
  FROM your_table
  ORDER BY indexed_column, other_column DESC
) sub
ORDER BY other_column;

Now with an index on indexed_column, we bet on faster query performance!

The aliasing hack for flexible ordering

When ordering parameters differ from those in the DISTINCT ON clause, aliasing columns can provide the needed flexibility. Your data will maintain its uniqueness but present in the desired order.

Food for thought:

SELECT alias.column1, alias.column2 FROM (
  SELECT DISTINCT ON (column1) column1 as col1, column2 as col2
  FROM your_table
  ORDER BY column1, column2 DESC
) alias
ORDER BY alias.col2;

See? You can use different aliases in subquery and change the ordering in your outer query without hints of confusion! Smooth, isn't it?

Bossing null handling

SQL can be a bit tricky with NULLs in order operations. But don't worry, PostgreSQL has you covered with NULLS FIRST or NULLS LAST.

Defeating NULL-related unexpected Results:

SELECT DISTINCT ON (column1) column1, column2
FROM your_table
ORDER BY column1, column2 DESC NULLS LAST;

explain-codes / Sql / Postgresql DISTINCT ON with different ORDER BY

Linked

Using the DISTINCT keyword causes this error: not a SELECTed expression



How to select only the first rows for each unique value of a column?



Sql query to select distinct row with minimum value



Mysql - SELECT all columns WHERE one column is DISTINCT



How to get the latest record in each group using GROUP BY?



Select a column in SQL not in Group By



How to join only one row in joined table with Postgres?



Optimizing value selection with DISTINCT ON Handling complex ordering needs with subqueries Window functions: a savory alternative to DISTINCT ON Subqueries and the performance game The aliasing hack for flexible ordering Bossing null handling