Select NOT IN multiple columns
Need to filter out rows that match a set of value combinations across multiple columns? The NOT EXISTS operator, coupled with a correlated subquery, deals with this elegantly. The script below illustrates one way of excluding certain value combinations from a table called your_table:
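A minimal sketch of such a script, run through Python's built-in sqlite3 module so it can be tried without a database server. The table name your_table, the columns col1 and col2, and the alias exc come from the article; the sample rows and the excluded pairs are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the article.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    INSERT INTO your_table (col1, col2) VALUES
        ('a', 'x'), ('a', 'y'), ('b', 'x'), ('c', 'z');
""")

# exc is a derived table listing the (col1, col2) pairs to exclude.
# A row of your_table survives only if no row in exc matches BOTH columns.
NOT_EXISTS_QUERY = """
    SELECT t.col1, t.col2
    FROM your_table AS t
    WHERE NOT EXISTS (
        SELECT 1
        FROM (
            SELECT 'a' AS col1, 'x' AS col2
            UNION ALL
            SELECT 'c', 'z'
        ) AS exc
        WHERE exc.col1 = t.col1
          AND exc.col2 = t.col2
    )
    ORDER BY t.id
"""

kept_rows = conn.execute(NOT_EXISTS_QUERY).fetchall()
print(kept_rows)
```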
Here the subquery creates a derived table (exc), and the outer query keeps only the rows of your_table whose (col1, col2) pair does not match any row in exc.
Alternatives to consider and potential pitfalls
For all the SQL aficionados out there willing to explore different approaches, let's discuss alternative techniques and some cautionary notes:
- Concatenating columns: Consider the cocktail of concatenating columns when you need a multi-column NOT IN. However, just as good cocktails are mixed with caution, so should this technique be used. It is high on performance cost, because the concatenated expression generally cannot use an index on the individual columns, and it is prone to false matches (and therefore wrongly excluded rows) if your delimiter naturally exists in the data.
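- Using LEFT JOIN to dodge NULL: The good old LEFT JOIN is a great way to dodge NULL bullets that hit you when using NOT IN. NULL comparisons in SQL follow three-valued logic and can lead to unexpected results. NOT IN can never return TRUE when its list contains a NULL, and a NULL value tested against a non-empty list yields UNKNOWN rather than TRUE. Either way, the row is silently dropped from the result!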
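- Performance Optimization: Let's talk about speed! NOT EXISTS and LEFT JOIN usually get the medal here: many optimizers can execute them as anti-joins, and because they compare the raw columns they stay index friendly. Exact plans depend on your database engine and data.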
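The pitfalls listed above can be reproduced with a small sketch, again using Python's sqlite3 module. The exclusion table exc, the '|' delimiter, and all sample rows are assumptions chosen to trigger each problem, and the NULL demonstration uses a single-column NOT IN to keep the three-valued logic easy to see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    CREATE TABLE exc (col1 TEXT, col2 TEXT);
    INSERT INTO your_table (col1, col2) VALUES
        ('a', 'x'), ('a', 'y'), ('a|b', 'c');
    INSERT INTO exc (col1, col2) VALUES
        ('a', 'x'), ('a', 'b|c');
""")

# LEFT JOIN anti-join: keep rows of your_table that found no partner in exc.
LEFT_JOIN_QUERY = """
    SELECT t.col1, t.col2
    FROM your_table AS t
    LEFT JOIN exc AS e
      ON e.col1 = t.col1 AND e.col2 = t.col2
    WHERE e.col1 IS NULL
    ORDER BY t.id
"""
left_join_rows = conn.execute(LEFT_JOIN_QUERY).fetchall()

# Concatenation: ('a|b', 'c') and ('a', 'b|c') both become 'a|b|c',
# so a row that should be kept is excluded by a false match.
concat_rows = conn.execute("""
    SELECT t.col1, t.col2
    FROM your_table AS t
    WHERE t.col1 || '|' || t.col2 NOT IN (
        SELECT e.col1 || '|' || e.col2 FROM exc AS e
    )
    ORDER BY t.id
""").fetchall()

# NULL pitfall: one NULL in the list means NOT IN never evaluates to TRUE.
conn.execute("INSERT INTO exc (col1, col2) VALUES (NULL, NULL)")
not_in_rows = conn.execute("""
    SELECT t.col1
    FROM your_table AS t
    WHERE t.col1 NOT IN (SELECT e.col1 FROM exc AS e)
""").fetchall()

# The anti-join is unaffected by the NULL row in exc.
left_join_rows_with_null = conn.execute(LEFT_JOIN_QUERY).fetchall()

print(left_join_rows, concat_rows, not_in_rows, left_join_rows_with_null)
```

In this sample the anti-join keeps two rows both before and after the NULL is added, the concatenated NOT IN loses the ('a|b', 'c') row to a delimiter collision, and the plain NOT IN returns nothing once exc contains a NULL.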
Picking your technique
Let's summarize to help you pick the right technique for your situation:
- Small dataset? NOT IN with concatenated columns could be the easy way out.
- Large dataset? Performance affected? Opt for NOT EXISTS or LEFT JOIN.
- Worrying about NULLs? Use LEFT JOIN or NOT EXISTS.
- Want readable code over slight performance gains? Choose the method that best expresses your intent in code.
Remember, always make an informed decision! Look at the specifics of your use case, measure against real-world scenarios, and select the method that provides maximum efficiency and lower maintenance costs.
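One way to start measuring is to compare the plans your engine produces for each candidate. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the exc table and the index name idx_exc_pair are hypothetical, other databases use their own EXPLAIN syntax, and the plan text varies by engine and version, so treat it as a starting point rather than a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    CREATE TABLE exc (col1 TEXT, col2 TEXT);
    -- Hypothetical index covering the pair being compared.
    CREATE INDEX idx_exc_pair ON exc (col1, col2);
""")

CANDIDATES = {
    "NOT EXISTS": """
        SELECT t.col1, t.col2
        FROM your_table AS t
        WHERE NOT EXISTS (
            SELECT 1 FROM exc AS e
            WHERE e.col1 = t.col1 AND e.col2 = t.col2
        )
    """,
    "LEFT JOIN": """
        SELECT t.col1, t.col2
        FROM your_table AS t
        LEFT JOIN exc AS e
          ON e.col1 = t.col1 AND e.col2 = t.col2
        WHERE e.col1 IS NULL
    """,
}

plans = {}
for name, sql in CANDIDATES.items():
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    plans[name] = [row[-1] for row in rows]
    print(name, plans[name])
```

Reading the plan steps side by side shows whether each variant scans exc or searches it through the index, which is a useful sanity check before timing the queries on realistic data volumes.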