
Best way to find the intersection of multiple sets?

python
set-operations
reduce
code-readability
by Nikita Barsukov · Aug 5, 2024
TLDR

For swiftly finding elements common to multiple sets, leverage the intersection() method or the & operator in Python.

If we call intersection():

    intersection = {1, 2, 3}.intersection({2, 3, 4}, {3, 4, 5})
    print(intersection)  # Outputs {3} as a result. No surprise there!

Or you can use the & operator:

    intersection = {1, 2, 3} & {2, 3, 4} & {3, 4, 5}
    print(intersection)  # Again outputs {3}. Coincidence? I think not!

Either way you get the common element {3}. Win-win!
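One difference between the two spellings is worth a quick sketch: the intersection() method accepts any iterable as an argument, while the & operator wants a set on both sides. The variable names below are purely illustrative.

```python
base = {1, 2, 3}

# The method form accepts any iterables: lists, tuples, ranges, ...
via_method = base.intersection([2, 3, 4], (3, 4, 5))

# The operator form needs sets on both sides, so convert first.
via_operator = base & set([2, 3, 4]) & set((3, 4, 5))

# Mixing a set and a list with & raises TypeError.
try:
    base & [2, 3, 4]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False

print(via_method, via_operator, operator_accepts_list)  # {3} {3} False
```

So if your inputs are not all sets yet, the method form saves you a conversion step.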

Handling empty lists like a pro

If you're dealing with a list of sets built at runtime, add an if-else check: set.intersection(*sets_list) raises a TypeError when the list is empty, so fall back to an empty set instead:

    sets_list = [{1, 2}, {2, 3}, {3, 4}]
    intersection = set.intersection(*sets_list) if sets_list else set()
    # Outputs set() here, because no element sits in all three sets.
    # An empty sets_list gives set() too. Like a boss!😎
    print(intersection)

Unleashing the power of reduce

Feel like flexing? Combine functools.reduce with set intersections:

    from functools import reduce

    sets_list = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
    intersection = reduce(lambda a, b: a & b, sets_list)
    print(intersection)  # SLAM DUNK! {3} it is!

But remember: just because you can, doesn't mean you should. Even Guido van Rossum prefers explicit for-loops for clarity's sake.
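If you do go with reduce, the lambda is optional. As a rough sketch, operator.and_ from the standard library does the same job as `lambda a, b: a & b`. Keep in mind that reduce raises a TypeError on an empty list when no initial value is given, so the guard below returns an empty set instead. The helper name and that fallback are assumptions for illustration, not a rule.

```python
from functools import reduce
from operator import and_


def intersect_with_reduce(sets_list):
    """Intersect all sets in sets_list using reduce (hypothetical helper)."""
    if not sets_list:
        # Assumption: treat "no sets at all" as an empty result.
        return set()
    return reduce(and_, sets_list)


print(intersect_with_reduce([{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]))  # {3}
print(intersect_with_reduce([]))  # set()
```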

When readability is the real MVP

To enhance code readability, stick to an explicit for-loop:

    sets_list = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
    # Copy the first set: "intersection = sets_list[0]" would make &= shrink sets_list[0] itself.
    intersection = set(sets_list[0])
    for s in sets_list[1:]:
        intersection &= s
    print(intersection)  # Look, mommy, no reduce! Still gives {3} though!

Seeing is believing! And explicit loops are easy on the eyes.
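An explicit loop also leaves room for a shortcut that reduce can't easily offer: once the running result is empty, no later set can bring anything back, so you can stop. Here is a minimal sketch, assuming you want an empty set for an empty list; the function name is made up for this example.

```python
def intersect_all(sets_list):
    """Intersect every set in sets_list, stopping early once nothing is left."""
    if not sets_list:
        return set()  # assumption: "no sets" means an empty result
    result = set(sets_list[0])  # copy, so the caller's first set is not mutated
    for s in sets_list[1:]:
        result &= s
        if not result:
            break  # already empty, later sets cannot add anything back
    return result


sets_list = [{1, 2}, {3, 4}, {1, 2, 3, 4}]
print(intersect_all(sets_list))  # set()
print(sets_list[0])              # {1, 2}, unchanged
```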

Edge cases: What might trip you up?

Empty set gotchas

Nothing here is undefined, but two cases can still spook you: calling set.intersection() with no sets at all raises a TypeError, and intersecting with an empty set always gives an empty set:

    # No sets to intersect
    try:
        set.intersection()
    except TypeError as error:
        print(error)  # Raises TypeError. Spooky, isn't it?

    # Intersecting with an empty set
    print({1, 2, 3}.intersection(set()))  # Outputs set(). Ghosted!

Prepare your code for these edge cases and sleep better at night.
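One way to prepare, sketched under some assumptions: wrap the call in a small helper that decides up front what "no input" should mean. The name common_elements and the empty_ok flag are invented for this example; pick whichever policy suits your code.

```python
def common_elements(collections, *, empty_ok=True):
    """Return the elements present in every collection.

    With no collections at all, return set() if empty_ok is true,
    otherwise raise ValueError so the caller notices the odd input.
    """
    sets = [set(c) for c in collections]  # also accepts lists, tuples, ...
    if not sets:
        if empty_ok:
            return set()
        raise ValueError("need at least one collection to intersect")
    return set.intersection(*sets)


print(common_elements([[1, 2, 3], [2, 3], (3, 2, 9)]))  # {2, 3}
print(common_elements([{1, 2, 3}, set()]))              # set()
print(common_elements([]))                              # set()
```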

Intersecting chains of sets optimally

When huge chains of sets need intersecting, reduce overhead:

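    # Godzilla-sized datasets fear the might of... sorting?
    # Example data only; swap in your own sets.
    large_set = set(range(1_000_000))
    medium_set = set(range(0, 1_000_000, 1_000))
    small_set = {0, 1_000, 2_000}
    sets_list = [large_set, medium_set, small_set]
    sorted_sets = sorted(sets_list, key=len)
    intersection = set.intersection(*sorted_sets)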

Key to success? Start with the smallest sets, so every intermediate result stays small. Efficiency wins!
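To see why starting small helps, here is a hand-rolled sketch of the same idea: walk only the smallest set and keep the items that every other set contains. This is an illustration of the principle, not a claim about how set.intersection works internally, and the function name and sample data are made up.

```python
def intersect_smallest_first(sets_list):
    """Keep the items of the smallest set that appear in all the others."""
    if not sets_list:
        return set()  # assumption: "no sets" means an empty result
    ordered = sorted(sets_list, key=len)
    smallest, others = ordered[0], ordered[1:]
    # Only len(smallest) items are ever examined, however big the others are.
    return {item for item in smallest if all(item in s for s in others)}


large_set = set(range(100_000))
medium_set = set(range(0, 100_000, 10))
small_set = {0, 5, 10, 999_999}

result = intersect_smallest_first([large_set, medium_set, small_set])
print(result)  # {0, 10}
```

Membership tests on a set take constant time on average, so the work scales with the size of the smallest set rather than the largest.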

Choosing set operations wisely

It also helps to know when intersection is the right tool and when another set operation fits better:

    # Unite we iterate, with union (|)
    print({1, 2} | {2, 3} | {3, 4})  # Outputs {1, 2, 3, 4}

    # Difference (-) strikes back
    print({1, 2, 3} - {2, 3, 4} - {3, 4, 5})  # Outputs {1}

Tip? Choose the right set operation, young Jedi.
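One thing that may help you choose: intersection and union don't care about the order of the sets, but difference does. A small sketch with throwaway names a, b and c:

```python
a, b, c = {1, 2, 3}, {2, 3, 4}, {3, 4, 5}

# Intersection and union give the same result in any order.
print(a & b & c == c & b & a)  # True
print(a | b | c == c | b | a)  # True

# Difference is evaluated left to right, so the starting set matters.
print(a - b - c)  # {1}
print(c - b - a)  # {5}

# The method forms take several arguments, just like intersection().
print(a.union(b, c))       # {1, 2, 3, 4, 5}
print(a.difference(b, c))  # {1}
```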