
How to check whether a pandas DataFrame is empty?

python
pandas
dataframe
best-practices
by Nikita Barsukov · Nov 12, 2024
TLDR

The quickest way to check whether a pandas DataFrame holds no data is the df.empty attribute, which is True when either axis (rows or columns) has length 0:

is_empty = df.empty # Yields True if your DataFrame is as empty as my bank account
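To make the behaviour of df.empty concrete, here is a minimal sketch using three made-up frames (the variable names are illustrative only). Note that a frame full of NaN still counts as non-empty, because it has rows.

```python
import numpy as np
import pandas as pd

# Three hypothetical frames, invented for illustration.
no_rows_no_cols = pd.DataFrame()
cols_but_no_rows = pd.DataFrame(columns=["a", "b"])
all_nan = pd.DataFrame({"a": [np.nan, np.nan]})

print(no_rows_no_cols.empty)   # True: nothing at all
print(cols_but_no_rows.empty)  # True: column labels, but zero rows
print(all_nan.empty)           # False: two rows exist, even though both are NaN
```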

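As another method, inspect the DataFrame's shape. A frame with column labels but no rows has shape (0, n) rather than (0, 0), so test for a zero on either axis:

is_empty = 0 in df.shape # Ditto, True when there are no rows or no columns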

These approaches offer quick checks for examining a DataFrame's contents, yet they only touch the surface.
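As a rough illustration of what shape reports, the sketch below builds a few hypothetical frames (names invented here) and records their shapes. A frame that has a header but no rows comes out as (0, 2), not (0, 0), which is worth keeping in mind when comparing shapes directly.

```python
import pandas as pd

# Hypothetical frames, invented for illustration.
frames = {
    "blank": pd.DataFrame(),
    "header_only": pd.DataFrame(columns=["a", "b"]),
    "one_row": pd.DataFrame({"a": [1], "b": [2]}),
}

# Shape of each frame as a (rows, columns) tuple.
shapes = {name: frame.shape for name, frame in frames.items()}

# True when at least one axis has length 0.
zero_axis = {name: 0 in shape for name, shape in shapes.items()}

for name in frames:
    print(name, shapes[name], zero_axis[name])
```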

Digging deeper: Handling diverse DataFrame scenarios

Are all values NaN or None?

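Here's a twist: a DataFrame may appear to contain elements even though all its values are NaN or None. So is it truly empty? Dropping the all-NaN rows leaves the columns in place, so check the result with .empty rather than comparing its shape to (0, 0):

is_really_empty = df.dropna(how='all').empty # True if DataFrame is a nan/none desert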
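A small sketch of what dropna(how='all') leaves behind, using an invented all-NaN frame. The rows disappear but the column labels survive, so the resulting shape is (0, 2).

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which every cell is missing.
df = pd.DataFrame({"a": [np.nan, None], "b": [np.nan, np.nan]})

# Drop rows in which every value is NaN/None.
remaining = df.dropna(how="all")

print(remaining.shape)  # (0, 2): no rows left, both columns kept
print(remaining.empty)  # True
```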

Check your type before you step into the DataFrame bar

Before you go checking for emptiness, make sure your prom date, df, is a pandas.DataFrame and not a costly None:

is_dataframe = isinstance(df, pd.DataFrame) # False for None too, so no separate party-pooper check is needed
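One way to combine the type check with the emptiness check is a small guard function. This is only a sketch; the name is_usable_frame is made up for this example and is not part of pandas.

```python
from typing import Any

import pandas as pd


def is_usable_frame(obj: Any) -> bool:
    """Return True only for a DataFrame with at least one row and one column.

    Hypothetical helper: isinstance() already rejects None and other types,
    so .empty is only evaluated on real DataFrames.
    """
    return isinstance(obj, pd.DataFrame) and not obj.empty


print(is_usable_frame(None))                      # False
print(is_usable_frame(pd.DataFrame()))            # False
print(is_usable_frame(pd.DataFrame({"a": [1]})))  # True
```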

Time and Memory: Our greatest nemeses

The story is a little different with large DataFrames. len(df.index) and len(df.columns) only read the lengths of the two axes, so they neither copy nor scan the data:

is_empty = len(df.index) == 0 or len(df.columns) == 0 # Efficiency rocks!
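A common place for this check is right after a filter that may match nothing. The sketch below uses an invented price column; the filtered frame keeps its column but has zero rows.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({"price": [10, 20, 30]})

# A filter that matches no rows.
filtered = df[df["price"] > 100]

n_rows, n_cols = len(filtered.index), len(filtered.columns)
is_empty = n_rows == 0 or n_cols == 0

print(n_rows, n_cols, is_empty)  # 0 1 True
```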

Counting sheep... and meaningful data

A more thorough check counts the non-NA entries in every column, so NaN and None values are disregarded:

has_data = df.count().sum() > 0 # True if DataFrame has something worth calling "data"

This methodology ensures there's something valid within the DataFrame's borders.
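To see what count() contributes, here is a minimal sketch on an invented frame with a single real value. count() returns the number of non-NA cells per column, and summing those gives the total.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one real value among NaNs.
df = pd.DataFrame({"a": [np.nan, 1.0], "b": [np.nan, np.nan]})

per_column = df.count()          # non-NA cells per column: a -> 1, b -> 0
has_data = per_column.sum() > 0  # at least one non-NA cell anywhere

print(per_column.to_dict())  # {'a': 1, 'b': 0}
print(bool(has_data))        # True
```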

Exploring exceptions when DataFrame is empty-like

Don't get fooled by pretentious null values

When your DataFrame might contain only NaN or None values in its rows or columns, bring out this heavy artillery:

is_truly_empty = df.isnull().all().all() # Wipes off the DataFrame's makeup
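The sketch below contrasts an all-NaN frame with one that contains a single zero (both invented for this example). Zero is a real value, not a null, so only the first frame passes the isnull().all().all() test.

```python
import numpy as np
import pandas as pd

# Hypothetical frames, invented for illustration.
df_nan = pd.DataFrame({"a": [np.nan, None], "b": [np.nan, np.nan]})
df_mixed = pd.DataFrame({"a": [np.nan, 0], "b": [np.nan, np.nan]})

# First .all() reduces each column, second .all() reduces across columns.
nan_only = bool(df_nan.isnull().all().all())
mixed_only_nan = bool(df_mixed.isnull().all().all())

print(df_nan.empty, nan_only)  # False True
print(mixed_only_nan)          # False: the 0 counts as data
```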

Silent zeros may not qualify as data

DataFrames may have a pre-assigned size but might not contain actual data, just an array of pesky zeros:

is_filled_with_nothings = not df.fillna(0).astype(bool).any().any() # The Zero Crusader: True if every value is 0 or NaN
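Because the direction of this test is easy to get backwards, here is a small sketch on two invented frames. The expression fillna(0).astype(bool).any().any() is True when at least one value is non-zero, so a frame of only zeros and NaN yields False.

```python
import numpy as np
import pandas as pd

# Hypothetical frames, invented for illustration.
df_zeros = pd.DataFrame({"a": [0, 0], "b": [0.0, np.nan]})
df_values = pd.DataFrame({"a": [0, 3], "b": [0.0, np.nan]})


def has_nonzero_values(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if any cell is neither 0 nor NaN."""
    return bool(df.fillna(0).astype(bool).any().any())


print(has_nonzero_values(df_zeros))   # False: only zeros and NaN
print(has_nonzero_values(df_values))  # True: the 3 is a real value
```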

Now are we just sitting ducks?

Make it talk, print results

Why not let the checks speak out their results?

print("DataFrame is empty just like the void in Voldemort's heart" if df.empty else "DataFrame is brimming with data... or is it?")

This makes the result visible at a glance, which is especially handy during early debugging or in log output.

Decision points in code workflow

Whether a DataFrame is empty or not may dictate the subsequent steps:

if df.empty:
    pass # action when you find that treasure chest is actually empty
else:
    pass # action when you find that the DataFrame isn't a dud after all!

Incorporate this structure for seamless integration into your code.
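One possible shape for such a branch is an early return at the top of a function. The function name summarize and its messages are invented for this sketch.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> str:
    """Hypothetical example: bail out early when there is nothing to process."""
    if df.empty:
        return "nothing to summarize"
    return f"{len(df.index)} rows, {len(df.columns)} columns"


print(summarize(pd.DataFrame()))                           # nothing to summarize
print(summarize(pd.DataFrame({"a": [1, 2], "b": [3, 4]})))  # 2 rows, 2 columns
```

Returning early keeps the main logic unindented and makes the empty case explicit to whoever reads the function next.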