Explain Codes LogoExplain Codes Logo

How to delete from select in MySQL?

sql
performance
best-practices
join
Nikita BarsukovbyNikita Barsukov·Oct 24, 2024
TLDR

No time to spare? Here's how to delete rows given a subquery condition:

DELETE FROM table1 WHERE id IN (SELECT id FROM table2 WHERE condition_column = 'value');

This banishes rows from table1 based on the table2 criteria. Try to be specific with your conditions and ensure the subquery columns match.

For duplicate armageddon using JOINs (the speedy way), behold:

DELETE t1 FROM table1 t1 JOIN ( SELECT id FROM table1 GROUP BY id HAVING COUNT(id) > 1 ) dup ON t1.id = dup.id;

Duplicates, prepare for deletion! 🦸‍♂️

Performance and duplicate handling

Duplicates like to play hide and seek. Tackle those copycats with the following:

DELETE t1 FROM table1 t1 INNER JOIN ( SELECT MIN(id) as keepId, name FROM table1 GROUP BY name ) t2 ON t1.name = t2.name AND t1.id != t2.keepId;

Ironically, we keep the oldest (minimum ID) of each name, and purge the rest. The keyword here is performance — avoid self-referential subqueries, they tend to be of the sloth variety. 🐌

Working with Moby Dick-like datasets? Migrate to a two-step approach:

  1. Collect IDs for deletion with a SELECT query.
  2. Unleash the DELETE operation with your gathered IDs.

Less mess, more clarity — and your future self will thank you when revisiting your code.

Safety first: Trial run your deletion

Before going all-in with your delete, preview what you're about to nuke:

SELECT * FROM table1 t1 WHERE EXISTS ( SELECT 1 FROM table1 t2 WHERE t1.id = t2.id HAVING COUNT(t2.id) > 1 );

This SELECT statement pinpoints the soon-to-be-zapped rows without doing the actual zapping. Satisfied? Swap SELECT * with DELETE.

Advanced usage and precautions

Using joins and aliases

Complex databases bring complex DELETE operations. Stay ahead with aliases:

DELETE t1 FROM table1 t1 JOIN table2 t2 ON t1.foreignKey = t2.id WHERE t2.condition = 'value';

Aliases like t1 and t2 make your code cleaner than a whistle. They clear any ambiguities when your joined tables share column names.

Use the power of subqueries

Nested subqueries with GROUP BY help you isolate rows with non-unique IDs:

DELETE FROM table1 WHERE id IN ( SELECT id FROM ( SELECT id FROM table1 GROUP BY id HAVING COUNT(id) > 1 ) AS WTF /* "WHERE's The Fruit", obviously */ );

Aliasing the subquery frees your code from self-reference errors.

Don't trip on the cascading staircase

Be mindful of cascading deletes: the butterfly effect of foreign key constraints with ON DELETE CASCADE.

Use a trial subset or employ transactions for safe rollbacks:

START TRANSACTION; -- Your DELETE operation ROLLBACK; -- If happy, replace with `COMMIT;`, if not, remain friends with your data!