Explain Codes LogoExplain Codes Logo

How do I find records that are not joined?

sql
join
best-practices
performance
Anton ShumikhinbyAnton Shumikhin·Oct 29, 2024
TLDR

To spotlight records in tableA that don't get any love from tableB, harness the power of LEFT JOIN and IS NULL:

SELECT a.* FROM tableA a LEFT JOIN tableB b ON a.id = b.id WHERE b.id IS NULL; -- "Sorry b.id, it's not you, it's NULL."

These lonely tableA records get their moment, standing alone like a popstar in a spotlight–with b.id IS NULL.

Unraveling LEFT JOIN and IS NULL

The belle of our answer's ball, LEFT JOIN woven with IS NULL, needs a little demystification. LEFT JOIN brings together all records from the left table, i.e., tableA and matched records from tableB. For every unpaired tableA record, it returns null for every field of tableB.

Our bouncer, the WHERE clause, filters out all b.id IS NULL, unveiling the tableA records that are single and ready to mingle.

If your join routine has multiple columns or complex expressions, you might want to amp up the inspection by checking every related tableB column against null.

Exploring other party venues

It's a big SQL world out there, and we have other ways to find our isolated tableA records.

Dance with NOT IN

An alternate strategy pirates the charm of NOT IN with subquery:

SELECT * FROM tableA WHERE id NOT IN (SELECT id FROM tableB); -- "tableA, we're sorry but your id's name is NOT IN the list."

This reels in tableA records without a match in tableB. Watch out for NULL values in tableB however, as they could throw a spanner in your NOT IN works.

The charm of EXCEPT

Our SQL deck also has an EXCEPT operator card, returning rows from one query that are missing from another:

SELECT id FROM tableA EXCEPT SELECT id FROM tableB; -- "EXCEPTionally tableA, tableB denied your id entry."

EXCEPT pulls in records from tableA uninvited in tableB.

Optimizing your performance moves

When grooving with sizeable datasets, you'd want to beware of the performance grim. Indexing the join columns or switching your NOT IN move to NOT EXISTS can work wonders:

SELECT a.* FROM tableA a WHERE NOT EXISTS ( SELECT 1 FROM tableB b WHERE a.id = b.id ); -- "Oh tableA, EXISTentially crisis'd much? tableB got your id ghosting."

With NOT EXISTS, the check stops at the first non-match, saving a whole lot of boogie energy.

Conquering integrity on the dance floor

Avoiding Null missteps

Your LEFT JOIN fields need to be dancer-friendly, or nullable. If tableA insists on not-nullable foreign keys to tableB, LEFT JOIN paired with IS NULL will cease to return those records.

Harmonizing the Data type beats

Keep the data type beats steady and consistent. Implicit type conversions that occur due to mismatched join condition data types can be performance killers.

Dancing on Indexes

Invest in a good dance floor, or indexes on your joined columns. This can groove down your query time significantly, while handling large datasets.