Update or Insert (multiple rows and columns) from subquery in PostgreSQL
When it comes to PostgreSQL, you can upsert data using the INSERT and ON CONFLICT clauses. Here's the typical SQL pattern for upserting:
Tweak this template to fit your context:
- Substitute
target_table
with your actual table name. column_list
represents the list of columns involved in your insert or update operation.source_data
denotes the subquery or table providing the new data.conflict_columns
identify unique rows, often equating to the primary key.column_to_update
: The column to be updated in the event of a conflict.
Impressively, the DO UPDATE SET
part ensures existing rows receive new values. EXCLUDED
is used to point to the row that would have been inserted barring any duplication.
Deep dive into the upsert operation
The term "upsert" cleverly combines "update" and "insert". Essentially, it checks for the existence of a row. If found, it performs an update; if not, it inserts a new row. PostgreSQL carries out this dual functionality in a single SQL statement. Notably, the efficiency in this approach lies in condensing two potential operations into one, thereby saving resources and execution time.
Mastering complex subqueries
In handling subqueries that veer towards the complex end of the spectrum:
- Always use an alias for your subquery as a fail-proof way to make referencing it in
INSERT
statements simpler. - Double-check data types and column matches—mismatched data types between main table and subquery are a common source of errors.
- Be precise with your join conditions within the subquery to prevent undesirable data updates.
Using temporary tables to enhance performance
When dealing with intense data manipulation (large datasets or complex joins, for instance), resource usage can skyrocket. Creating a temporary table can give a much-needed boost to your upsert operation's performance. This temporary table should contain the outcome of your intricate operations, resulting in a smoother insert or update process:
Optimization tricks at your fingertips
Harnessing JOINs in subqueries
Make JOINs
your best friend when dealing with subqueries. They allow updates across multiple tables and collate data from different sources. Here, data reconciliation is taken care of within the subquery itself, which is more resource-saving than doing so post-insertion.
Streamlining INSERTs minus VALUES
When you're performing numerous INSERTions via subquery, you can ditch the VALUES
syntax for a SELECT subquery, resulting in a smoother:
Such an approach is far more straightforward and lean, especially when dealing with a large number of records.
For each truck:
- Check if it already has the SAME data box
- If yes: Update the box inside (🔄)
- If no: Load a new data box (➕)
**UPSERT visualised** (via `ON CONFLICT`):
```markdown
🚚🔄🚚➕🚚➕...
# Each truck gets updated with NEW data or receives ADDITIONAL data boxes
The key takeaway here is to efficiently handle these updates and insertions in bulk—ensure each box meets the right truck just once!
Making every update precise and efficient
The art of correct column referencing
When detailing columns in the SET
clause, it's pivotal to match them properly between the main table and the excluded (conflicted
) subquery. Incorrect cross-references can thwart the operation.
How the WHERE clause refines updates
In certain scenarios, not every conflict should automatically switch to an update. Use the WHERE
clause to filter these conditions and guarantee that only specific rows are updated:
The virtue of understanding transaction isolation
A solid grasp of PostgreSQL's transaction isolation levels can empower you to design your upsert operations to be both performant and reliable. Side bonus: it helps to minimize risks of deadlocks and anomalies.
Was this article helpful?