Explain Codes LogoExplain Codes Logo

Concatenate multiple rows in an array with SQL on PostgreSQL

sql
array_agg
postgresql
data-aggregation
Alex KataevbyAlex Kataev·Oct 10, 2024
TLDR

Instantly merge multiple row values into an array with the array_agg function. It's like your data is going on a reunion and array_agg is the bus driver:

SELECT array_agg(your_column) FROM your_table;

And if you want to have them grouped with a tie-breaker—a key column—no worries:

SELECT primary_key, array_agg(your_column) FROM your_table GROUP BY primary_key;

And here's how you can get your grouped data sorted like star students on result day:

SELECT primary_key, array_agg(your_column ORDER BY another_column) FROM your_table GROUP BY primary_key;

The Inner Workings of array_agg and Data Types

It's essential to understand that array_agg acts like that kid in class who collects everyone's homework. It takes in the values and assembles a nice array. However, be careful with the data types that you're trying to aggregate. Mixing apples with oranges isn't a good idea—even in data aggregation.

Alongside array_agg, consider using generate_series when the wind of randomness blows in your data. It's like having your own bucket list, but for order:

WITH data_gather AS ( SELECT primary_key, your_column, row_number() OVER (PARTITION BY primary_key ORDER BY another_column) as ticket_no FROM your_table ) SELECT primary_key, array_agg(your_column ORDER BY ticket_no) FROM data_gather GROUP BY primary_key;

Here, each data row gets an orderly queue ticket and they're guaranteed to be on the array_agg bus in the right sequence.

Craft Your Concatenation Strategy Like a Pro

Sometimes you may encounter a situation where array_agg just doesn't cut it. This can be a great opportunity to show off your skills. With your very own, homemade, custom aggregate function, you can append elements in a way that array_agg can only dream of. Roll up your sleeves, let's get started:

CREATE FUNCTION array_append_agg(anyarray, anyelement) RETURNS anyarray AS $$ SELECT array_append($1, $2); $$ LANGUAGE sql IMMUTABLE; CREATE AGGREGATE array_accum (anyelement) ( SFUNC = array_append_agg, STYPE = anyarray, INITCOND = '{}' );

You've added another tool in your toolbelt. Now go on and test drive this baby with your data. Error handling isn't just a phrase in your job description — it comes in handy in the wild.

Setting Up Your Sorting and Error Traps

While sorting your array_agg, adding an identifier is like putting name tags on your data. And if you're keen on having your results in order, be sure to call out to your primary_key in the ORDER BY outside the array_agg:

SELECT key_column, array_agg(column_name) FROM table_name GROUP BY key_column ORDER BY key_column;

array_agg: Your Pocket Tool

array_agg has been around since PostgreSQL 8.4.8, and it became a crowd favourite for pulling a rabbit out of the hat efficiently. There are others like string_agg or manual concatenation methods, but array_agg has earned a place in Postgres users' hearts.

Troubleshooting Your Query

Working with array_agg can be smooth sailing, but remember that array_agg just doesn't like dealing with NULL values. So remember kiddos, NULLs are just the invisible potato salads. Having Cape of Visibility (IS NOT NULL condition) is handy here.

And when using generate_series, make sure the generated series doesn't jack up your query with unintended ordering or gaps.