Explain Codes LogoExplain Codes Logo

Bulk insert, update if on conflict (bulk upsert) on Postgres

sql
bulk-insert
upsert
postgresql
Nikita BarsukovbyNikita Barsukov·Jan 12, 2025
TLDR

Bulk upsert boils down to using the INSERT ... ON CONFLICT ... DO UPDATE command in Postgres. It specifically performs an insert operation, but swings into an update when a duplicate key comes swaggering down the aisle.

Here's an example for table my_table with unique constraint on id:

# Here we go, party is on! INSERT INTO my_table (id, column_a, column_b) VALUES (1, 'Aussie', 'Beaver'), # Meet Aussie the beaver (2, 'Cheeky', 'Dog') # And his friend, Cheeky the dog # Ah! Duplicate keys! But we're cool. ON CONFLICT (id) DO UPDATE SET column_a = EXCLUDED.column_a, # Aussie becomes Cheeky? Plot twist! column_b = EXCLUDED.column_b; # Beaver is a dog now? Much wow!

In this saga, the EXCLUDED table valiantly holds the contender row which otherwise would have been an insert, thus column updates echo the intended influx.

Harnessing UPSERT for bulk actions

Bulk operations in PostgreSQL may seem like unruly witches, but the ON CONFLICT clause works like a talisman, enhancing the upsert spell.

Dealing with multiple values per column

Put unnest function in your magic toolkit when the task is to manage multiple values for each column. This little charm transmutes arrays into rows, thus paving the way for bulk insert operations.

# Unroll the package! INSERT INTO my_table (id, column_a, column_b) SELECT * FROM unnest( ARRAY[1, 2], ARRAY['Whiskers', 'Fuzzball'], ARRAY['Cat', 'Dog'] ) # Bump into a duplicate? No worries ON CONFLICT (id) DO UPDATE SET column_a = EXCLUDED.column_a, column_b = EXCLUDED.column_b;

Cast the perfect spell with conflict targets

In a mystical land where DSAs meet databases with multiple unique constraints, you can selectively choose your battle grounds to determine the conflict resolution stratagem.

# Brace for impact INSERT INTO my_table (id, column_a, column_b) VALUES ... # Magic happens when you know where to hit ON CONFLICT (id) DO UPDATE SET ... WHERE my_table.column_a IS DISTINCT FROM EXCLUDED.column_a; # Yeah, handle it like teen drama!

Partial updates: latching onto nuances

Sometimes, you want to preserve certain columns as sacred relics. Insert a conditional clause in ON CONFLICT to perform partial updates.

# Don’t like full transformations? ON CONFLICT (id) DO UPDATE SET column_a = EXCLUDED.column_a # Keep calm, and update selectively WHERE my_table.column_a IS DISTINCT FROM EXCLUDED.column_a;

Through this ordeal of conflict targets, and conditional updates, you wield the power to ensure a precise and swift handling of bulk upserts.

Lay the right groundwork with indexes

Effective bulk upserts are a product of meticulous indexing. Choose the right primary keys or unique constraints to prevent “Where’s Waldo?” scenarios during conflict resolution.

Brace for impact: performance knick-knacks

A quick heads-up: upsert operations can get hefty. Especially with mountains of data, it's wise to portion your upserts, unless you fancy a trip to “performance hibernation”.

Walking on thin ice: concurrent transactions

Be mindful of unseen monsters like row locking during upsert. PostgreSQL handles this out-of-the-box, but the deadlocks ghost can make occasional surprise visits in a concurrent transaction milieu.

Pitfall alert! And here are the solutions

Crossed paths with update errors? One common misstep is not enlisting all columns in EXCLUDED. Ensure your updates echo your intentions by calling EXCLUDED values with a roll call.

By following these tips and anticipating common pitfalls, you can optimize your upserts for speed and reliability.