Select NOT IN multiple columns
Filtering out rows that match a set of values across multiple columns? The NOT EXISTS operator coupled with a correlated subquery can deal with this elegantly. The script below illustrates an efficient method of excluding certain value combinations from a table called your_table:
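A minimal sketch of such a script, wrapped in Python's built-in sqlite3 module so that it runs as-is. The sample rows and the excluded pairs ('a', 'x') and ('b', 'x') are assumptions made up for illustration, and the SQL may need small syntax changes on other database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER, col1 TEXT, col2 TEXT);
    INSERT INTO your_table VALUES
        (1, 'a', 'x'),
        (2, 'a', 'y'),
        (3, 'b', 'x'),
        (4, NULL, 'x');
""")

# Keep every row whose (col1, col2) pair is absent from the derived table exc.
NOT_EXISTS_QUERY = """
    SELECT yt.id
    FROM your_table AS yt
    WHERE NOT EXISTS (
        SELECT 1
        FROM (
            SELECT 'a' AS col1, 'x' AS col2   -- illustrative excluded pairs
            UNION ALL
            SELECT 'b', 'x'
        ) AS exc
        WHERE exc.col1 = yt.col1
          AND exc.col2 = yt.col2
    )
    ORDER BY yt.id
"""

kept_ids = [row[0] for row in conn.execute(NOT_EXISTS_QUERY)]
print(kept_ids)  # [2, 4]: rows 1 and 3 match an excluded pair, the NULL row is kept
```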
Here the subquery creates a derived table (exc) holding the value pairs to exclude. A row of your_table is kept only when no row of exc matches it on both col1 and col2.
Alternatives to consider and potential pitfalls
For the SQL aficionados willing to explore different approaches, here are some alternative techniques and a few cautionary notes:
- Concatenating columns: Gluing the columns into a single value lets you emulate a multi-column NOT IN. Use this technique with caution. It is high on performance cost, because a concatenated expression generally cannot use the indexes on the individual columns, and it is prone to false matches (rows wrongly excluded) if your delimiter naturally exists in the data.
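- Using LEFT JOIN to dodge NULL: The good old LEFT JOIN, filtered with IS NULL on the joined table, is a great way to dodge NULL bullets when you would otherwise use NOT IN. NULL comparisons in SQL can lead to unexpected results. NULL NOT IN a non-empty list evaluates to UNKNOWN, not TRUE, and a single NULL inside the list means NOT IN can never be TRUE, so the query returns no rows!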
- Performance optimization: Let's talk about speed! NOT EXISTS and LEFT JOIN usually get the medal here. Many optimizers can execute them as anti-joins, and they are index-friendly because they compare the columns directly.
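The two pitfalls above, and the LEFT JOIN workaround, can be reproduced with a small sketch. It again uses Python's built-in sqlite3 module; the tables exc_collision and exc_with_null, the '|' delimiter, and the sample rows are all hypothetical and chosen only to trigger each problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER, col1 TEXT, col2 TEXT);
    INSERT INTO your_table VALUES (1, 'a', 'b|c'), (2, 'a', 'y'), (3, 'b', 'x');

    -- Hypothetical exclusion lists, one per pitfall.
    CREATE TABLE exc_collision (col1 TEXT, col2 TEXT);
    INSERT INTO exc_collision VALUES ('a|b', 'c');
    CREATE TABLE exc_with_null (col1 TEXT, col2 TEXT);
    INSERT INTO exc_with_null VALUES ('b', 'x'), (NULL, 'x');
""")


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# Pitfall 1: ('a', 'b|c') and ('a|b', 'c') both concatenate to 'a|b|c',
# so row 1 is dropped although its pair is not in the exclusion list.
concat_ids = ids("""
    SELECT id FROM your_table
    WHERE col1 || '|' || col2 NOT IN (
        SELECT col1 || '|' || col2 FROM exc_collision
    )
    ORDER BY id
""")

# Pitfall 2: a NULL in the list means NOT IN is never TRUE, so no row survives.
not_in_ids = ids("""
    SELECT id FROM your_table
    WHERE col1 || '|' || col2 NOT IN (
        SELECT col1 || '|' || col2 FROM exc_with_null
    )
    ORDER BY id
""")

# LEFT JOIN anti-join on the same list: unmatched rows carry NULLs on the
# right-hand side, so filtering on e.col2 IS NULL keeps exactly those rows.
left_join_ids = ids("""
    SELECT yt.id
    FROM your_table AS yt
    LEFT JOIN exc_with_null AS e
           ON e.col1 = yt.col1 AND e.col2 = yt.col2
    WHERE e.col2 IS NULL
    ORDER BY yt.id
""")

print(concat_ids)     # [2, 3]
print(not_in_ids)     # []
print(left_join_ids)  # [1, 2]
```

The IS NULL check is safe on e.col2 here because any matched row necessarily has a non-NULL col2; in a real schema, a primary key or other NOT NULL column of the joined table is the usual choice.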
Picking your technique
Let's summarize and help you pick the right technique for your situation:
- Small dataset? NOT IN with concatenated columns could be the easy way out.
- Large dataset? Performance affected? Opt for NOT EXISTS or LEFT JOIN.
- Worrying about NULLs? Use LEFT JOIN or NOT EXISTS.
- Want readable code over slight performance gains? Choose the method that best expresses your intent in code.
Remember, always make an informed decision! Look at the specifics of your use case, measure against realistic data, and select the method that gives the best efficiency at the lowest maintenance cost.
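One way to start measuring is to compare the plans the engine picks for each candidate. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the exclusion table exc and the index exc_pair_idx are hypothetical, and SQLite is only a stand-in, since other engines expose their own EXPLAIN syntax and may choose very different plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER, col1 TEXT, col2 TEXT);
    CREATE TABLE exc (col1 TEXT, col2 TEXT);        -- hypothetical exclusion table
    CREATE INDEX exc_pair_idx ON exc (col1, col2);  -- hypothetical index
""")

CANDIDATES = {
    "NOT EXISTS": """
        SELECT yt.id FROM your_table AS yt
        WHERE NOT EXISTS (
            SELECT 1 FROM exc
            WHERE exc.col1 = yt.col1 AND exc.col2 = yt.col2
        )
    """,
    "LEFT JOIN": """
        SELECT yt.id FROM your_table AS yt
        LEFT JOIN exc ON exc.col1 = yt.col1 AND exc.col2 = yt.col2
        WHERE exc.col2 IS NULL
    """,
    "concatenated NOT IN": """
        SELECT id FROM your_table
        WHERE col1 || '|' || col2 NOT IN (SELECT col1 || '|' || col2 FROM exc)
    """,
}

plans = {}
for name, sql in CANDIDATES.items():
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    plans[name] = [row[-1] for row in rows]
    print(f"{name}: {plans[name]}")
```

The exact plan text depends on the SQLite version, so treat the output as something to read rather than to hard-code, and confirm any conclusion by timing the queries on data of realistic size.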