Filter dataframe rows if value in column matches a set list of values
Key points:
- @matches accesses the matches variable inside the query() function. But don't tell anyone, it's a secret.
- Simplicity: Bypass the .isin() drama; keep it short, keep it clean!
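The key points above can be sketched roughly as follows. This is a minimal illustration, not the article's original snippet: the DataFrame contents and the 'val' column are made-up sample data, and 'col' simply stands in for whichever column you are filtering.

```python
import pandas as pd

# Hypothetical sample data; 'col' is the column we filter on.
df = pd.DataFrame({"col": ["a", "b", "c", "a"], "val": [1, 2, 3, 4]})

# The list of values we want to keep.
matches = ["a", "c"]

# '@matches' tells query() to look up the local Python variable 'matches'.
filtered = df.query("col in @matches")
print(filtered)
```

Rows whose 'col' value appears in matches survive; everything else is dropped.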
When to use isin: Oh no, not the set!
Use isin when you exhibit symptoms of chronic scrolling syndrome caused by nested loops. It gives you direct row selection against a list of criteria.
Just remember, isin is your dad's Extremely Specific Set of Skills: it will find your DataFrame rows, and it will filter them.
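Here is a small sketch of isin in action, assuming invented 'city' and 'sales' columns purely for illustration. The ~ operator flips the boolean mask when you want the rows that are not in the list.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Cairo", "Lima"],
    "sales": [10, 20, 30, 40],
})
wanted = ["Lima", "Cairo"]

# isin() returns a boolean Series: True where the value is in 'wanted'.
in_list = df[df["city"].isin(wanted)]

# '~' inverts the mask, keeping rows NOT in 'wanted'.
not_in_list = df[~df["city"].isin(wanted)]

print(in_list)
print(not_in_list)
```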
Sniffing out patterns with regular expressions
Is it just me, or is there a strong whiff of partial matches in the air?
Have a crack at str.contains.
Words of Wisdom: Use case=False for case-insensitive hunts and | inside the pattern for hunting parties (matching any one of several terms).
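A hedged sketch of partial matching with str.contains follows. The 'name' column and its values are invented for the example, and na=False is an extra precaution (not mentioned in the text above) so that missing values count as non-matches instead of breaking the mask.

```python
import pandas as pd

# Hypothetical sample data, including a missing value.
df = pd.DataFrame({"name": ["Apple pie", "banana split", "Cherry tart", None]})

# case=False ignores letter case; na=False treats missing values as "no match".
apple_rows = df[df["name"].str.contains("apple", case=False, na=False)]

# '|' is regex alternation: match "banana" OR "cherry".
either = df[df["name"].str.contains("banana|cherry", case=False, na=False)]

print(apple_rows)
print(either)
```

Because the pattern is treated as a regular expression by default, characters such as '.' or '(' in your search terms need escaping, or you can pass regex=False for a literal match.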
Logically filtering with operators
Feeling strained by control issues? Divide and conquer with the logical operators '&', '|' and '~' for 'and', 'or' and 'not'!
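The three operators might be combined like this. The column names and thresholds are assumptions made up for the sketch. Note the parentheses around each condition: '&' and '|' bind more tightly than comparisons in Python, so leaving them out usually ends in an error.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"col": ["a", "b", "c", "d"], "score": [5, 15, 25, 35]})

# AND: value is in the list and the score is above 10.
both = df[(df["col"].isin(["a", "b"])) & (df["score"] > 10)]

# OR: either condition is enough.
either = df[(df["col"] == "a") | (df["score"] > 30)]

# NOT: invert a condition.
negated = df[~(df["score"] > 10)]

print(both, either, negated, sep="\n\n")
```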
Filtering efficiently in a numeric range? Yes, please!
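One way to do this, sketched under the assumption of a made-up 'age' column, is Series.between, which includes both endpoints by default. The explicit comparison version is shown next to it for contrast.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"age": [15, 18, 30, 65, 70]})

# between() is inclusive on both ends by default.
in_range = df[df["age"].between(18, 65)]

# The same filter spelled out with comparison operators.
same = df[(df["age"] >= 18) & (df["age"] <= 65)]

print(in_range)
```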
Performant filtering for data the size of a mammoth
When your dataset is bigger than the universe and 'col' only has to match one measly list, try using .loc with .isin!
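Pro tip: Use .loc to pick rows and columns in a single step. Skipping chained indexing spares you an intermediate copy, so you feel less memory poor and more speed rich!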
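A minimal sketch of pairing .loc with .isin, using invented columns 'a' and 'b' alongside 'col'. The row mask and the column list go into one .loc call, so only the wanted slice is materialized.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "col": ["x", "y", "z", "x"],
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
})
matches = ["x", "z"]

# Row selection (boolean mask) and column selection in a single .loc call.
subset = df.loc[df["col"].isin(matches), ["col", "a"]]

print(subset)
```

How much this helps in practice depends on your data, so it is worth timing on your own DataFrame before declaring victory.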
Create masks: Put on your face pack, dear DataFrame!
Masks are deep-cleansing boolean facials that leave your DataFrame squeaky clean.
Skincare Guru Tip: Storing masks in named variables for complex filtering leaves DataFrames radiant and code squeaky clean!
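A rough sketch of the named-mask routine, with made-up 'score' and 'active' columns. Each mask is an ordinary boolean Series, so the final filter reads almost like a sentence.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "col": ["a", "b", "c", "a"],
    "score": [5, 15, 25, 35],
    "active": [True, False, True, True],
})

# Give each condition a descriptive name.
is_match = df["col"].isin(["a", "c"])
high_score = df["score"] > 10
is_active = df["active"]

# Combine the named masks in one readable expression.
result = df[is_match & high_score & is_active]

print(result)
```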
Merging (AKA Dealing with Big Data)
Got dynasty-sized datasets? Merge like it's a reality TV show: one way is to put your list of values in its own DataFrame and inner-join on the column.
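A hedged sketch of filtering by merge. The lookup DataFrame 'wanted' and the sample data are assumptions for illustration; an inner join keeps only the rows of df whose 'col' value also appears in the lookup table.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"col": ["a", "b", "c", "a"], "val": [1, 2, 3, 4]})

# The values to keep, stored as a one-column DataFrame.
# drop_duplicates() guards against repeated values multiplying rows in the join.
wanted = pd.DataFrame({"col": ["a", "c"]}).drop_duplicates()

# Inner join: only rows with a matching 'col' in both frames survive.
filtered = df.merge(wanted, on="col", how="inner")

print(filtered)
```

Unlike a boolean mask, merge returns a fresh index, so reach for it when the list of values already lives in another DataFrame.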