Explain Codes LogoExplain Codes Logo

How can I verify if one list is a subset of another?

python
collections
data-structure
performance
Nikita BarsukovbyNikita Barsukov·Jan 22, 2025
TLDR

Need to check if list_a is a subset of list_b? Whip out set.issubset():

list_a = [1, 2, 3] list_b = [1, 2, 3, 4, 5] is_subset = set(list_a).issubset(list_b)

If every element in list_a is found in list_b, the result is True.

For those who love shortcuts, the <= set operator gets the job done just as well.

is_subset = set(list_a) <= set(list_b)

"Magic?" Nope. This is Python working efficiently behind the scenes – no manual looping, perfect for when order is of no importance, and duplicates are unwelcome guests.

When duplicates come to the party!

Now, what if your lists cannot do without duplicates? Try this on for size: using collections.Counter:

from collections import Counter list_a = [1, 2, 2] # Oh, hello there Mr. 2, twice in the same list! list_b = [1, 1, 2, 2, 3] # Look at 1 and 2, twinning so hard! is_subset = Counter(list_a) & Counter(list_b) == Counter(list_a) # ^ Party is rocking! Duplicates are allowed and everyone's having fun!

Titan-size datasets? Frozenset to the rescue!

When dealing with ginormous datasets, convert your lists to frozenset for that immutable and hashable static lookup goodness:

list_a = frozenset([1, 2, 3]) list_b = frozenset([1, 2, 3, 4, 5]) # Absolutely yuuuge, with a hint of frozenset! is_subset = list_a.issubset(list_b) # No worries about data changes midway. Frozenset has got you covered!

For static lookup tables: Optimize, or go home!

Static lookup tables? Here's a bright idea: pre-process that data into a set or frozenset for a speedier performance:

# Our unchanging lineup static_lineup = frozenset(['Rock', 'Pop', 'Jazz', 'Blues']) subgroup_lineup = frozenset(['Jazz', 'Bebop']) # Subset check in 3... 2... 1... is_subset = subgroup_lineup.issubset(static_lineup)

Subset check vs Subsequence search: Know the Difference!

For a subset check, your list's elements can chill anywhere they want – top, bottom or middle.

But a subsequence check? It's all about order. Elements stick together, like best friends:

# Subset: hangout mode. is_subset = set(list_a) <= set(list_b) # Subsequence: gravity applies! def is_subsequence(list_a, list_b): it = iter(list_a) # Iterator initiation! return all(item in it for item in list_b) # If every item in list_b appears in list_a, sequentially, it's a subsequence.

On your mark, get set!

In Python, the right data structure is key. Lists are great for ordered collections, but when set operations (like subset checks) enter the equation, the situation begs for sets or frozensets, courtesy of their O(1) membership tests.

Working with unique keys? Use dictionary key views for subset checks:

dict_a = {'one': 1, 'two': 2} dict_b = {'one': 1, 'two': 2, 'three': 3} is_subset = dict_a.keys() <= dict_b.keys()

In the Land of Large Datasets

Working with large, static lookup tables? Use the frozenset data structure for a performance boost:

# Scenario: Do all user actions match admin actions? admin_actions = frozenset(['edit', 'delete', 'ban']) user_actions = set(['edit', 'comment']) is_allowed = user_actions.issubset(admin_actions) # ^ The user has the 'edit' power, but can't 'delete' or 'ban'. Sounds fair!

Multiplicity matters: When you care about counts

Dealing with such as inventory systems where item counts are critical:

# Scenario: We've got an order to fulfill! warehouse = Counter(['widget', 'sprocket', 'widget']) order = Counter(['widget', 'widget']) can_fulfill = order & warehouse == order # ^ Two widgets coming right up from our overstocked warehouse!

Dictionary Key Views: A unique approach

When mapping resources, verify a set of services based on dictionary key views:

services = {'http': 80, 'https': 443, 'smtp': 25} required_services = {'http', 'smtp'} available = required_services <= services.keys() # ^ Checks if the required services are in our services dictionary.