Explain Codes LogoExplain Codes Logo

How can I do an UPDATE statement with JOIN in SQL Server?

sql
update
join
performance
Alex KataevbyAlex Kataev·Aug 18, 2024
TLDR

Embrace the sweet symphony of SQL with the UPDATE and JOIN tune:

UPDATE t1 SET t1.column = t2.column FROM Table1 t1 JOIN Table2 t2 ON t1.id = t2.foreignKey WHERE t1.condition;

Align the data Table1 and Table2 based on their relational key, add helpful aliases (t1, t2) for better readability, and define the WHERE clause to filter the rows you're updating. This rhythm works with any JOIN type.

Understanding UPDATEs with JOINs

Let's break down the pieces. The backbone of SQL lies in its capability to connect data across tables. A common approach for modifying rows in one table based on another is JOINing them in blissful harmony. SQL Server’s syntax plays well for this.

JOIN Types in SQL Server

Variations of the JOIN solo can refine your UPDATE operations:

  • INNER JOIN: Updates when there's a matching record in both tables.
  • LEFT JOIN: ALSO updates non-matching rows from the left table with either NULL or default values.
  • RIGHT JOIN: A LEFT JOIN but switch the roles of the left and right tables.
  • FULL JOIN: Merges LEFT and RIGHT like a grand symphony, updating all needed rows from both tables.

Why Alias Tables?

Thoroughly alias your tables (a, b instead of table1, table2). It's easier on the eyes, less confusing, a real lifesaver when dealing with more tables or verbose table names.

The Role of the WHERE Clause

Your WHERE clause is your conductor, leading the UPDATE orchestra to hit the right rows. A clear and structured WHERE clause prevents those off-key notes that could corrupt your performance... I mean... data.

Locking Mechanism

When the audience (dataset) is big, be wary of the locking behavior. SQL Server tends to lock rows, pages, or sometimes even entire tables. This can affect the other performances (queries) happening concurrently. Test your UPDATE statements thoroughly and if necessary, consider encores (breaking your UPDATEs into several smaller transactions).

When Subqueries Strike the Right Note

There are moments where instead of a grand ensemble, a solo performance hits the right note - this is when a subquery might be more suitable. For example, you need to update with an aggregate value like SUM or MAX. Cue the subquery:

UPDATE t1 SET t1.aggregateValue = (SELECT MAX(t2.value) FROM Table2 t2 WHERE t2.foreignKey = t1.id) FROM Table1 t1 WHERE t1.condition;

Advanced Scenarios: Nulls, Duplicates, and Performance

Handling Nullable Columns

Remember, NULLs can crash the party. If t2.column can have NULLs and you want to preserve existing data in t1.column, consider an IS NULL check:

UPDATE t1 SET t1.column = ISNULL(t2.column, t1.column) FROM Table1 t1 JOIN Table2 t2 ON t1.id = t2.foreignKey WHERE t1.condition;

Dealing with Duplicate Rows

What if Table2 has a band of duplicate keys? You could possibly update Table1 with arbitrary sama data. A sub-select can help you pick the right tune:

UPDATE t1 SET t1.column = ( SELECT TOP 1 t2.column FROM Table2 t2 WHERE t2.foreignKey = t1.id ORDER BY t2.criteria DESC ) FROM Table1 t1 WHERE t1.condition;

Optimizing for Performance

Got a big concert with a large audience (data)? Indexing the columns involved in the JOIN and WHERE clauses is the opening act that smooths the main performance. It speeds your queries dramatically.

Cross-Database Updates

Going international with cross-database updates? You require the appropriate visa (read: permissions) and use the full passport details (fully qualified table names):

UPDATE db1.dbo.Table1 SET column = db2.dbo.Table2.column FROM db1.dbo.Table1 t1 JOIN db2.dbo.Table2 t2 ON t1.id = t2.foreignKey WHERE t1.condition;