Compute list difference
The quick answer for list difference is a list comprehension with not in, which keeps the remaining elements in their original order.
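A minimal sketch of the order-preserving approach. The lists list_a and list_b and their values are illustrative assumptions, not data from the article:

```python
# Hypothetical sample data for illustration.
list_a = [3, 1, 4, 1, 5]
list_b = [1, 9]

# Keep every element of list_a that does not appear in list_b,
# in the order it occurs in list_a.
difference = [item for item in list_a if item not in list_b]

print(difference)  # [3, 4, 5]
```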
If the order does not matter, set subtraction will get the job done, though it also collapses duplicates.
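A sketch of the set-based variant, again with illustrative sample values. Because sets are unordered, the result is sorted here only to make the printed output predictable:

```python
# Hypothetical sample data for illustration.
list_a = [3, 1, 4, 1, 5]
list_b = [1, 9]

# Set subtraction removes every value found in list_b.
# Duplicates in list_a are collapsed and the original order is lost.
difference = list(set(list_a) - set(list_b))

print(sorted(difference))  # [3, 4, 5]
```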
Both yield the elements of list_a that are not in list_b.
No-loss computation
When the list order matters, convert list_b to a set and then run a list comprehension against it. This retains the order of elements from list_a while taking advantage of the set's faster membership lookup (constant time on average, versus a linear scan of a list).
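The idea can be sketched as follows. The name exclude and the sample values are assumptions made for illustration, and the elements must be hashable for the set conversion to work:

```python
# Hypothetical sample data for illustration.
list_a = ["pear", "apple", "fig", "apple", "kiwi"]
list_b = ["apple", "plum"]

# Build the lookup set once, outside the comprehension,
# so each membership test does not rescan list_b.
exclude = set(list_b)

# Order of list_a is preserved; every "apple" is dropped.
difference = [item for item in list_a if item not in exclude]

print(difference)  # ['pear', 'fig', 'kiwi']
```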
Retaining repeats
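To keep duplicates, use collections.Counter, which treats each list as a multiset.
This method subtracts frequencies, so each occurrence in list_b cancels at most one matching occurrence in list_a and the count of the remaining items is preserved. Note that Counter.elements() groups equal items together in first-seen order; if the original positions matter, walk list_a and consume the counts instead.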
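A sketch of both variants. The helper multiset_difference is a hypothetical name introduced here, and the sample values are illustrative:

```python
from collections import Counter

def multiset_difference(list_a, list_b):
    """Remove one occurrence from list_a per occurrence in list_b,
    keeping the surviving items in their original positions."""
    remaining = Counter(list_b)  # how many of each item are still to be removed
    result = []
    for item in list_a:
        if remaining[item] > 0:
            remaining[item] -= 1  # this occurrence is cancelled
        else:
            result.append(item)
    return result

# Hypothetical sample data for illustration.
list_a = [1, 2, 2, 3, 2, 4]
list_b = [2, 4, 4]

# Plain Counter subtraction: counts are right, equal items are grouped.
grouped = list((Counter(list_a) - Counter(list_b)).elements())
print(grouped)  # [1, 2, 2, 3]

# Position-preserving variant.
ordered = multiset_difference(list_a, list_b)
print(ordered)  # [1, 2, 3, 2]
```

Both results contain the same items with the same counts; they differ only in where the surviving duplicates end up.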
Advanced methods with difflib
In complex circumstances, where standard list operations are insufficient, use difflib.SequenceMatcher.
difflib reports not only which elements differ but also where: its opcodes describe each run of the two sequences as equal, replace, delete, or insert, which suits diff computations where position and context matter.
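A minimal sketch of reading those opcodes. The sample lists and the names removed and added are assumptions for illustration, and the elements must be hashable:

```python
from difflib import SequenceMatcher

# Hypothetical sample data for illustration.
list_a = ["a", "b", "c", "d"]
list_b = ["a", "c", "d", "e"]

# isjunk=None means no element is treated as junk;
# autojunk=False disables the popularity heuristic for long sequences.
matcher = SequenceMatcher(None, list_a, list_b, autojunk=False)

removed, added = [], []
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    # list_a[i1:i2] and list_b[j1:j2] are the spans this opcode covers.
    if tag in ("delete", "replace"):
        removed.extend(list_a[i1:i2])
    if tag in ("insert", "replace"):
        added.extend(list_b[j1:j2])

print(removed)  # ['b']
print(added)    # ['e']
```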
Large scale performance
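Remember time complexity with big lists. With n elements in list_a and m in list_b, a list comprehension that tests membership against a set runs in O(n + m) on average, because each lookup is constant time. Testing against list_b directly scans it for every element, which gives O(n*m) and becomes inefficient for large datasets.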
For large numeric arrays, NumPy offers vectorized alternatives.
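A sketch using two NumPy functions, assuming NumPy is installed; the array names and values are illustrative. numpy.setdiff1d returns the sorted unique values, while a mask built with numpy.isin keeps the original order and duplicates:

```python
import numpy as np

# Hypothetical sample data for illustration.
array_a = np.array([5, 1, 3, 1, 4])
array_b = np.array([1, 2])

# Sorted, de-duplicated values of array_a that are not in array_b.
unique_sorted = np.setdiff1d(array_a, array_b)
print(unique_sorted)  # [3 4 5]

# Boolean mask: True where the element of array_a is absent from array_b.
mask = ~np.isin(array_a, array_b)
ordered = array_a[mask]
print(ordered)  # [5 3 4]
```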