Explain Codes LogoExplain Codes Logo

Aggregate function in an SQL update query?

sql
update-statements
aggregate-functions
sql-server
Nikita BarsukovbyNikita Barsukov·Sep 25, 2024
TLDR

Let's get straight to it. Here's how you execute an UPDATE query with a subquery performing the aggregate function:

UPDATE table_name SET column1 = (SELECT AGGREGATE_FUNC(column2) FROM another_table WHERE condition) WHERE table_name.id = another_table.foreign_id;

Be sure AGGREGATE_FUNC is SUM, AVG, etc., table_name identifies your target table, column1 is the column you're updating, and condition is ready to filter your rows. Make the connection between the two tables with an appropriate identifier (id, foreign_id).

Incorporating aggregate into an update using JOIN

It's not uncommon to have to UPDATE records with an aggregate from a correlated subquery. For an efficient way to use an INNER JOIN to apply an aggregate result to your target fields, check out this pattern:

UPDATE t1 SET t1.column1 = t2.aggregated_result FROM table1 t1 INNER JOIN ( SELECT foreign_key, AGGREGATE_FUNC(column) as aggregated_result FROM another_table GROUP BY foreign_key -- Don't GROUP BY the dessert menu... get it? Sorry, programmer humor. 🙃 ) t2 ON t1.id = t2.foreign_key;

The subquery (dubbed t2 in this scenario) calculates the aggregate, which is then joined with table1 (t1). Confirm field names and join conditions to avoid any nasty surprises.

SQL Server versus PostgreSQL: Understanding the nuances

PostgreSQL plays by slightly different rules. Instead of using a JOIN in the FROM clause, you'll use a SELECT approach like so:

UPDATE table_name SET column1 = t2.aggregated_result FROM ( SELECT foreign_id, AGGREGATE_FUNC(column) as aggregated_result FROM another_table GROUP BY foreign_id ) t2 WHERE table_name.id = t2.foreign_id;

In this scenario, the FROM clause contains a SELECT subquery that does the hard work of calculating the aggregate before the update happens.

Boosting performance: Index your way to success

Dealing with large datasets? Don't forget to slap indexes onto the columns in your joins and WHERE conditions to give your performance a welcomed boost.

Data inconsistency checks: Better safe than sorry

Before running large updates, test your query on a sample dataset to ensure you've catered to any potential data conflicts or anomalies like null values.

SQL Server's secret weapon: CROSS APPLY update

For those in the SQL Server camp, CROSS APPLY is a powerful alternative for updating fields based on aggregated data:

UPDATE t1 SET t1.column1 = t2.aggregated_result FROM table1 t1 CROSS APPLY ( SELECT AGGREGATE_FUNC(column) as aggregated_result FROM another_table WHERE t1.id = foreign_id -- CROSS APPLY: For when INNER JOIN feels too mainstream. ) t2;

This approach lets the aggregated data get calculated for each row in table1.

Keep aggregated data in check: One row, one value

Ensure your subqueries deliver a single value per row to sidestep accidentally updating rows with skewed aggregate values.

A matter of clarity: Table aliases

Keep your tables straight with table aliases when updating tables based on aggregate data, removing guesswork when dealing with columns from multiple tables.

Syntax spot-check: Cross-reference with your SQL system

UPDATE statements can differ between systems like SQL Server and PostgreSQL. When in doubt, refer to your database system's documentation.