
Dropping infinite values from dataframes in pandas?

python
dataframe
pandas
data-cleaning
by Nikita Barsukov · Jan 9, 2025
TLDR

Let's kick out those arithmetic outcasts, infinity and negative infinity, from your DataFrame in a single line of code:

df = df.replace([np.inf, -np.inf], np.nan).dropna() # This line cracks down on infinities faster than the number police!

The pandas .replace() method swaps both kinds of infinity (np.inf and -np.inf) for NaN, and .dropna() then shows every row containing a NaN the exit. Note that this also removes rows that already held NaN before the replacement.
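As a quick illustration of the one-liner, here is a minimal sketch on a made-up DataFrame (the column names and values are assumptions, not data from this article). It shows that the row with a pre-existing NaN disappears along with the infinite ones:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one +inf, one -inf, and one pre-existing NaN.
df = pd.DataFrame({
    "a": [1.0, np.inf, 3.0, 4.0],
    "b": [10.0, 20.0, -np.inf, np.nan],
})

# Turn both infinities into NaN, then drop every row that contains a NaN.
cleaned = df.replace([np.inf, -np.inf], np.nan).dropna()
print(cleaned)  # only the first row (index 0) is left
```

If the pre-existing NaN rows matter to you, filter on infinities directly instead of routing them through dropna().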

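Want to treat infinities as NaN only within a particular block of code? The mode.use_inf_as_na option does that, though it has been deprecated since pandas 2.1 in favor of converting infinities explicitly:

with pd.option_context('mode.use_inf_as_na', True): df = df.dropna() # Infinities are undercover NaNs now!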

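To have pandas treat infinities as NaN for the rest of the session, set the option globally. This drops nothing by itself; it only changes how methods such as .isna() and .dropna() see infinities, and it has been deprecated since pandas 2.1:

pd.set_option('mode.use_inf_as_na', True) # Infinity? Never heard of it!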

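To keep only the rows in which every value is finite (this needs all-numeric columns, and it drops rows containing NaN as well, because NaN is not finite):

df = df[np.isfinite(df).all(axis=1)] # All infinite values, pack your bags and leave!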
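A small sketch of the finite-only filter on invented data (the columns "x" and "y" are hypothetical). It makes one side effect visible: np.isfinite is False for NaN as well as for infinities, so both kinds of rows are dropped.

```python
import numpy as np
import pandas as pd

# Hypothetical all-numeric sample: row 1 holds +inf, row 2 holds NaN.
df = pd.DataFrame({
    "x": [1.0, np.inf, 3.0, 4.0],
    "y": [5.0, 6.0, np.nan, 8.0],
})

# True only for rows where every value is finite (neither inf nor NaN).
finite_mask = np.isfinite(df).all(axis=1)
finite_only = df[finite_mask]
print(finite_only)  # rows 0 and 3 remain
```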

If you only want to remove infinities found in certain columns:

df = df.replace({col: {np.inf: np.nan, -np.inf: np.nan} for col in ['col1', 'col2']}).dropna(subset=['col1', 'col2']) # Sorry infinities, you're not invited to the column party!

Data integrity is crucial. The dropna-based approaches above remove entire rows, so check how many rows you are about to lose before sending valuable data into exile.
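One way to do that check, sketched on invented data (the column names are assumptions), is to count infinities and NaNs before dropping anything:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two infinite cells and one NaN cell.
df = pd.DataFrame({
    "a": [1.0, np.inf, 3.0, np.nan],
    "b": [-np.inf, 2.0, 3.0, 4.0],
})

inf_counts = np.isinf(df).sum()                  # infinities per column
rows_with_inf = int(np.isinf(df).any(axis=1).sum())
rows_with_nan = int(df.isna().any(axis=1).sum())

# Rows that replace(...).dropna() would remove: any row with inf or NaN.
rows_lost = len(df) - len(df.replace([np.inf, -np.inf], np.nan).dropna())

print(inf_counts)
print(rows_with_inf, rows_with_nan, rows_lost)
```

Comparing rows_lost with rows_with_inf shows how many rows would be dropped only because they already contained NaN.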

Silence Of The Infinities

Say you have some calculations in your DataFrame that result in positive or negative infinity. They’re not harmful per se, but they could skew your analysis or clutter up your data visualization.
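For context, here is a minimal sketch of how such values typically appear and why they matter. The column names are made up; the behavior shown is ordinary floating-point division by zero in pandas:

```python
import numpy as np
import pandas as pd

# Hypothetical data: every denominator is zero.
df = pd.DataFrame({
    "numerator": [1.0, -2.0, 0.0],
    "denominator": [0.0, 0.0, 0.0],
})

# Float division by zero yields +inf, -inf, or NaN (for 0/0) rather than an error.
df["ratio"] = df["numerator"] / df["denominator"]
print(df)

# A single infinity is enough to make an aggregate useless.
skewed_mean = pd.Series([1.0, 2.0, np.inf]).mean()
print(skewed_mean)
```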

Handling infinities as NaN

Treating these infinite values as NaN can simplify your data operations. Once the infinities have been converted to NaN, the usual missing-data tools such as .isna() and .dropna() can identify and remove them.

Scenario specifics

Focusing on a particular column to get rid of infinite values?

df['your_column'] = df['your_column'].replace([np.inf, -np.inf], np.nan) # Infinity's VIP pass to 'your_column' has been revoked!

Want to keep only the rows where one column holds a finite value?

df = df[np.isfinite(df['your_column'])] # Rows are filtered on 'your_column' only; NaN in that column is dropped too. Great attention to detail there!

Need to replace infinities in every float column of a DataFrame, whatever the float width?

df = df.apply(lambda x: x.replace([np.inf, -np.inf], np.nan) if pd.api.types.is_float_dtype(x) else x) # Float columns only, please (float32 included). We don't serve infinities here!

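On older versions of pandas, the option is called use_inf_as_null instead of use_inf_as_na. On pandas 2.1 and later, use_inf_as_na itself is deprecated, so the explicit .replace() approach is the most portable. Keep an eye on your versions!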

Wrangling those infinite values

Non-numeric data types can trip up these approaches: np.isfinite, for example, raises a TypeError when the DataFrame contains object (string) columns, and masked or nullable values may not behave like plain floats. It therefore pays to explore your data first, starting with df.dtypes, and to tailor the solution to what you find.
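A sketch of that pitfall and one possible workaround, on an invented mixed-dtype DataFrame (the column names are hypothetical). Restricting the check to numeric columns with select_dtypes is one option among several, not the only fix:

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-dtype data: one string column, one float, one int.
df = pd.DataFrame({
    "label": ["x", "y", "z"],
    "value": [1.0, np.inf, 3.0],
    "count": [1, 2, 3],
})

print(df.dtypes)  # look before you clean

# np.isfinite cannot handle the object column.
try:
    np.isfinite(df)
    raised_type_error = False
except TypeError:
    raised_type_error = True

# Workaround: build the mask from numeric columns only, then filter the full frame.
numeric = df.select_dtypes(include="number")
mask = np.isfinite(numeric).all(axis=1)
finite_rows = df[mask]
print(finite_rows)  # rows 0 and 2, with the string column intact
```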

Another approach is to create a cleaning function for your data, which handles infinities alongside other data cleaning steps.

Going off-script and writing your own function to identify and handle specific data issues makes unexpected results much easier to debug. Consider wrapping the risky steps in a try/except block, not for speed but so that failures surface with a clear error message instead of a cryptic traceback.
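Such a function could look roughly like the sketch below. The name drop_infinite_rows and its columns parameter are hypothetical and not part of pandas. This version removes only rows with infinities and leaves NaN rows alone, a design choice you may want to reverse:

```python
import numpy as np
import pandas as pd


def drop_infinite_rows(df, columns=None):
    """Return a copy of df without rows holding +/-inf in the given columns.

    Hypothetical helper; `columns` defaults to every numeric column.
    """
    if columns is None:
        columns = df.select_dtypes(include="number").columns.tolist()
    try:
        # True for rows with at least one infinite value in the checked columns.
        mask = np.isinf(df[columns]).any(axis=1)
    except (KeyError, TypeError) as exc:
        # KeyError: unknown column name; TypeError: non-numeric column.
        raise ValueError(f"cannot check {columns!r} for infinities: {exc}") from exc
    return df.loc[~mask].copy()


# Usage on made-up data: the inf row goes, the NaN row stays.
sample = pd.DataFrame({"a": [1.0, np.inf, np.nan], "label": ["x", "y", "z"]})
result = drop_infinite_rows(sample)
print(result)
```

Passing a string column explicitly, for example drop_infinite_rows(sample, columns=["label"]), raises a ValueError that names the offending columns.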