Explain Codes LogoExplain Codes Logo

How to aggregate values into an array in PostgreSQL?

sql
array-aggregation
postgresql
data-mining
Alex KataevbyAlex KataevΒ·Nov 14, 2024
⚑TLDR

Here's the bread and butter of aggregating values into an array in PostgreSQL:

-- Basic array aggregation SELECT array_agg(column_name) FROM table_name;

To bundle related values into a single array per group:

-- Grouping and array aggregation SELECT group_column, array_agg(column_name) FROM table_name GROUP BY group_column;

For sorted arrays that care about order:

-- Sorted arrays - delicate operation, handle with care! πŸ˜‰ SELECT group_column, array_agg(column_name ORDER BY sort_column) FROM table_name GROUP BY group_column;

These three code blocks are your swiss army knife for creating and organizing arrays in PostgreSQL.

Dealing with Student-Grade relationship

Suppose we have Student and Grade tables. We want student grades clumped together in an array:

-- Behold! Our innocent tables. CREATE TABLE Student ( Id SERIAL PRIMARY KEY, Name VARCHAR(100) ); CREATE TABLE Grade ( Id SERIAL PRIMARY KEY, StudentId INT REFERENCES Student(Id), Mark INT );

Merging grades into a single array per student:

-- Surgery time! Merging grades into a single row (students will be thrilled!). SELECT s.Name AS StudentName, array_agg(g.Mark ORDER BY g.Mark) AS Marks FROM Student s LEFT JOIN Grade g ON s.Id = g.StudentId GROUP BY s.Id , s.Name; -- Don't forget to group by name as well!

Subqueries for additional control

For times when ordinary joins don't cut it, use subqueries to assemble arrays:

-- It's like playing 'Operation' with your data. SELECT s.Name AS StudentName, (SELECT array_agg(Mark) FROM Grade WHERE StudentId = s.Id AND Mark > 75) AS AwesomeMarks FROM Student s;

This will yield only high marks. Sorry, no free passes for students!

Text Aggregation and Conversion

If you're more of a "strings attached" person, STRING_AGG() is your buddy:

-- Going stringy for a change. SELECT group_column, STRING_AGG(column_name::text, ', ' ORDER BY sort_column) FROM table_name GROUP BY group_column;

Or to ASCII-fy an array:

-- Abra-cadabra, Array-to-Stringa! SELECT group_column, array_to_string(array_agg(column_name), ', ') FROM table_name GROUP BY group_column;

Different tools for different rules!

Keeping Order with Data

The ORDER BY clause within array_agg() maintains sequence:

-- Chronological ordering. As if writing a diary. 😊 SELECT UserId, array_agg(Amount ORDER BY TransDate) AS OrderedAmounts FROM Transactions GROUP BY UserId;

Handling Nulls and Duplicates in Style

Null values and duplicates - the necessary evils. PostgreSQL's array_agg() includes them by default. However, you can give them a slip:

-- DO NOT PASS (null values)! SELECT group_column, array_agg(column_name) FILTER (WHERE column_name IS NOT NULL) FROM table_name GROUP BY group_column;

Getting rid of sneaky duplicates:

-- Shoo duplicates, don't bother us! SELECT group_column, array_agg(DISTINCT column_name) FROM table_name GROUP BY group_column;

A clean array is a contributor to peaceful data mining!