Postgres - Function to return the intersection of 2 ARRAYs?

by Alex Kataev · Jan 8, 2025
TLDR

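To calculate the intersection of two arrays in PostgreSQL, a combo of unnest, INTERSECT, and array_agg is your lifeline: unnest turns each array into rows, INTERSECT keeps the rows both sides share, and array_agg packs them back into an array. Here's the magic spell: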

SELECT array_agg(intersect_element)
FROM (
    -- Unnest the first array, like unraveling a ball of yarn in a cat's playground
    SELECT unnest(first_array) AS intersect_element
    INTERSECT
    -- Do the same for the second array, because why should the first array have all the fun?
    SELECT unnest(second_array)
) AS sub;

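Replace first_array and second_array with your actual arrays, and voilà! You get a single array of the long-lost siblings, aka the common elements. Three things to keep in mind: INTERSECT removes duplicates, the order of the result is not guaranteed, and array_agg returns NULL (not an empty array) when the two arrays share nothing.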
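To see the pattern with real values, here is a minimal sketch using two literal integer arrays. The values and the alias elem are illustrative, not from the original. It sorts inside array_agg, which assumes PostgreSQL 9.0 or later, so the output is deterministic:

```sql
SELECT array_agg(elem ORDER BY elem) AS common
FROM (
    -- Each unnest() call expands an array into one row per element.
    SELECT unnest(ARRAY[4, 2, 6]) AS elem
    INTERSECT
    SELECT unnest(ARRAY[2, 3, 4])
) AS sub;
-- common: {2,4}
```

Without the ORDER BY, the same two elements come back, but their order is up to the planner.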

Beyond the looking glass: advanced intersection tricks

PostgreSQL 8.4 and later ship both unnest and array_agg, which makes them a Swiss Army knife in your dev toolkit when you need the intersection of two arrays. So without further ado, let's roll up our sleeves and see what's under the hood:

Custom function for intersection: Because why not?

The quick fix can indeed patch up things pretty well for a one-time gig, but if the intersection of arrays is your bread and butter, you need a more robust solution. Here's how to create a custom function to ease your life:

CREATE OR REPLACE FUNCTION array_intersection(anyarray, anyarray)
RETURNS anyarray AS $$
    -- Step right up, folks! Watch as we transform two plain arrays into one
    -- telepathically-linked array. Now, who said magic isn't real?
    SELECT array_agg(intersect_element)
    FROM (
        SELECT unnest($1) AS intersect_element
        INTERSECT
        SELECT unnest($2)
    ) AS sub;
$$ LANGUAGE sql IMMUTABLE;

Showtime:

SELECT array_intersection('{4,2,6}'::int[], '{2,3,4}'::int[]);
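With these inputs the result holds 2 and 4, in no guaranteed order. One gotcha: array_agg returns NULL when nothing matches. If you would rather get an empty array back, one possible variant swaps array_agg for the ARRAY(...) constructor, which yields {} for an empty subquery. The function name array_intersection_or_empty is hypothetical, chosen here for illustration:

```sql
-- Sketch of a variant that returns '{}' instead of NULL for disjoint inputs.
CREATE OR REPLACE FUNCTION array_intersection_or_empty(anyarray, anyarray)
RETURNS anyarray AS $$
    SELECT ARRAY(
        SELECT unnest($1)
        INTERSECT
        SELECT unnest($2)
    );
$$ LANGUAGE sql IMMUTABLE;

-- Disjoint arrays: nothing in common.
SELECT array_intersection_or_empty('{1,2}'::int[], '{3,4}'::int[]) AS common;
-- common: {}
```

Which behavior is right depends on the caller: NULL is easy to filter with IS NOT NULL, while {} plays more nicely with functions such as cardinality.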

The & operator: Simplicity is the ultimate sophistication

With the intarray extension, Postgres gives you the & operator for integer arrays, a secret weapon which is as simple as a potato, yet as handy as a paperclip:

SELECT ARRAY[1, 4, 2] & ARRAY[2, 3]; -- Hey you, Mr '&'! Yes, you. Meet these charming arrays and let's see what happens next.

Returns {2}, a lonely singleton amidst a crowd. Just remember, this one requires the intarray extension to be installed in your database, like spinach to Popeye, and it only works on integer arrays that contain no NULL elements.
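A minimal sketch of the setup, assuming PostgreSQL 9.1 or later (where CREATE EXTENSION is available) and that the intarray contrib package is present on the server:

```sql
-- Install the extension once per database (may require elevated privileges).
CREATE EXTENSION IF NOT EXISTS intarray;

-- & is intarray's intersection operator for integer[] operands.
SELECT ARRAY[1, 4, 2] & ARRAY[2, 3] AS common;
-- common: {2}
```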

Wrangling arrays with array operators

Array operators in PostgreSQL are like fully charged power tools. The ANY construct, for instance, can take you places:

SELECT ARRAY(
    SELECT elem
    FROM unnest(array1) AS t(elem)   -- name the unnested column explicitly
    WHERE elem = ANY(array2)
);
-- There's no such thing as too many parentheses, right?

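This is like a compact Swiss Army knife for finding the intersection of two arrays inline, for those times when creating a function is not an option. Unlike the INTERSECT version, it keeps duplicates from the first array, so the two approaches are not interchangeable.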
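Here is a hedged, runnable take on the ANY approach with literal arrays. It assumes PostgreSQL 9.4 or later for WITH ORDINALITY, which is added here (it is not part of the original query) so that the first array's order is preserved explicitly rather than by accident:

```sql
SELECT ARRAY(
    SELECT elem
    FROM unnest(ARRAY[3, 1, 3, 2]) WITH ORDINALITY AS t(elem, ord)
    WHERE elem = ANY(ARRAY[3, 2])   -- keep elements that appear in the second array
    ORDER BY ord                    -- keep the first array's original order
) AS common;
-- common: {3,3,2}
```

Note that 3 shows up twice: every matching element of the first array survives, duplicates included.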

Tackling the big fishes: large arrays and edge cases

Life's full of surprises, and so is your array data. From large arrays to unique outputs, from optimizing performances to handling specific data types, here's how to cover all your bases:

Intersection with unique outputs: Dealing with the doppelgangers

CREATE OR REPLACE FUNCTION array_unique_intersection(anyarray, anyarray)
RETURNS anyarray AS $$
    -- Hey twins! I hate to burst the bubble, but we need just one of you here. Sorry!
    -- Note: plain INTERSECT already removes duplicates, so DISTINCT is a
    -- belt-and-braces guarantee here rather than a change in behavior.
    SELECT array_agg(DISTINCT intersect_element)
    FROM (
        SELECT unnest($1) AS intersect_element
        INTERSECT
        SELECT unnest($2)
    ) AS sub;
$$ LANGUAGE sql IMMUTABLE;
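Whether the doppelgangers survive depends on the set operator. As a small illustration (the literal arrays and column aliases are made up for this sketch), plain INTERSECT deduplicates, while INTERSECT ALL keeps as many copies as both sides share:

```sql
SELECT
    ARRAY(
        SELECT unnest(ARRAY[2, 2, 3])
        INTERSECT
        SELECT unnest(ARRAY[2, 2, 4])
    ) AS deduplicated,       -- {2}
    ARRAY(
        SELECT unnest(ARRAY[2, 2, 3])
        INTERSECT ALL
        SELECT unnest(ARRAY[2, 2, 4])
    ) AS with_duplicates;    -- {2,2}
```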

Performance considerations: For when speed matters

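With large arrays, your go-to intersection solution may start dragging its feet, because every call unnests and compares both arrays from scratch. Indexes won't help compute the intersection itself. They can help one step earlier, though: when the arrays live in a table column, a GIN index lets the && (overlap) operator skip rows that share nothing with your search array, so the intersection only runs on the rows that matter.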
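When the arrays are stored in a table, one way to keep things quick is to filter with the overlap operator before intersecting. The sketch below assumes a hypothetical table items with an integer[] column tags; neither name comes from the original:

```sql
-- Hypothetical table: each row carries an integer array of tag ids.
CREATE TABLE items (
    id   serial PRIMARY KEY,
    tags integer[] NOT NULL
);

-- GIN indexes support the array overlap (&&) and containment (@>, <@) operators.
CREATE INDEX items_tags_gin ON items USING gin (tags);

-- Filter with && first (the planner can use the GIN index for this),
-- then compute the intersection only for the rows that survive.
SELECT id,
       ARRAY(
           SELECT unnest(tags)
           INTERSECT
           SELECT unnest(ARRAY[2, 3, 4])
       ) AS common
FROM items
WHERE tags && ARRAY[2, 3, 4];
```

Rows with no element in common are discarded by the WHERE clause, so the per-row unnest work is spent only where there is something to find.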

Unleashing your creativity

Feel free to experiment with different data types and scenarios. After all, variety is the spice of life:

SELECT array_intersection(ARRAY['apple', 'banana'], ARRAY['banana', 'cherry']);
-- What's common between a fruit salad and this query? Let's find out!

For generating test arrays, the generate_series function is a gold mine. Wrap it in an ARRAY(...) constructor and you get sequence-generated arrays that are perfect for testing or for intersecting larger inputs.
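As a quick sketch of that idea, the query below builds two arrays with generate_series and intersects them inline. The CTE name inputs and the column names are illustrative only:

```sql
WITH inputs AS (
    SELECT
        ARRAY(SELECT generate_series(1, 20, 2)) AS odd_numbers,   -- 1, 3, 5, ..., 19
        ARRAY(SELECT generate_series(1, 20, 3)) AS every_third    -- 1, 4, 7, ..., 19
)
SELECT ARRAY(
    SELECT unnest(odd_numbers)
    INTERSECT
    SELECT unnest(every_third)
    ORDER BY 1
) AS common
FROM inputs;
-- common: {1,7,13,19}
```

Raising the upper bounds is an easy way to produce large inputs for timing experiments.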

Future-proofing your code

PostgreSQL is a moving train, with newer versions bringing more efficient ways to handle array intersections. Stay updated with release notes and community discussions for a smooth coding journey.

Limitations: Every rose has its thorn

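Array operations might be power-packed, but they harbor some limitations too. INTERSECT needs an equality operator for the element type, so arrays of types without one (json, for example) will raise an error. unnest flattens multidimensional arrays, so the result is always one-dimensional. And NULL elements behave differently depending on the approach: INTERSECT treats two NULLs as a match, while = ANY never matches a NULL.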
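One concrete sharp edge is NULL handling, where the INTERSECT and ANY approaches disagree. A small side-by-side sketch, with literal arrays and column aliases invented for illustration:

```sql
SELECT
    ARRAY(
        SELECT unnest(ARRAY[1, NULL])
        INTERSECT                          -- set operations treat NULLs as equal
        SELECT unnest(ARRAY[NULL, 2])
    ) AS via_intersect,                    -- {NULL}
    ARRAY(
        SELECT elem
        FROM unnest(ARRAY[1, NULL]) AS t(elem)
        WHERE elem = ANY(ARRAY[NULL, 2])   -- NULL = anything is never true
    ) AS via_any;                          -- {}
```

If your arrays can contain NULLs, decide which of the two behaviors you actually want before picking an approach.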

No shortcuts for testing

Always test your functions against a variety of scenarios before they strut their stuff on the production stage.
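One lightweight way to do that is a DO block with ASSERT statements, which assumes PostgreSQL 9.5 or later. The sketch below re-creates the array_intersection function from earlier so it can run on its own; the specific test cases are suggestions, not an exhaustive suite:

```sql
CREATE OR REPLACE FUNCTION array_intersection(anyarray, anyarray)
RETURNS anyarray AS $$
    SELECT array_agg(intersect_element)
    FROM (
        SELECT unnest($1) AS intersect_element
        INTERSECT
        SELECT unnest($2)
    ) AS sub;
$$ LANGUAGE sql IMMUTABLE;

DO $$
DECLARE
    result integer[];
BEGIN
    -- Common elements, compared order-insensitively via mutual containment.
    result := array_intersection('{4,2,6}'::int[], '{2,3,4}'::int[]);
    ASSERT result @> ARRAY[2, 4] AND result <@ ARRAY[2, 4], 'expected 2 and 4';

    -- Disjoint inputs: array_agg over zero rows yields NULL.
    ASSERT array_intersection('{1}'::int[], '{2}'::int[]) IS NULL,
        'expected NULL for disjoint arrays';

    -- Duplicates in the inputs collapse to a single element.
    ASSERT array_intersection('{2,2,3}'::int[], '{2,2}'::int[]) = ARRAY[2],
        'expected {2}';

    -- Text arrays work too, thanks to the polymorphic anyarray signature.
    ASSERT array_intersection(ARRAY['apple', 'banana'], ARRAY['banana', 'cherry'])
           = ARRAY['banana'],
        'expected {banana}';
END
$$;
```

If every assertion holds, the block finishes silently; a failed ASSERT raises an error carrying the message next to it.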