Convert floats to ints in Pandas?

python

dataframe

pandas

data-integrity

by Nikita Barsukov · Oct 1, 2024

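To convert floats to ints in Pandas, call the .astype() method with int (a plain NumPy integer) or 'Int64' (capital I, the nullable integer type that can also hold missing values). This works for a single column or for the entire DataFrame. Note that astype(int) drops the fractional part rather than rounding.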

Consider the following examples:

df['col'] = df['col'].astype(int)  # Single column
df = df.astype(int)                # Entire DataFrame
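To make the truncation behavior concrete, here is a minimal sketch on a made-up column (the values are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"col": [1.9, -1.9, 2.0]})

# astype(int) drops the fractional part (truncation toward zero); it does not round.
truncated = df["col"].astype(int)
print(truncated.tolist())  # [1, -1, 2]
```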

Handling missing values and rounding

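When converting floating-point numbers, rounding and missing values both need attention. astype(int) raises an error if the column contains NaN, so either fill the gaps first with fillna(0.0) or convert to the nullable 'Int64' type, which stores them as <NA>. And because astype(int) simply drops the fractional part, call round() first if you want the nearest integer. Rounding also matters for 'Int64', which refuses values that are not whole numbers: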

# Captain NaN, the stealthy data pest, beware!
df['col'] = df['col'].fillna(0.0).round().astype('Int64')
df = df.fillna(0.0).round().astype('Int64')  # One round to conquer NaNs! 👊
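The two ways of dealing with NaN can be compared side by side. This is a small sketch on invented data; note that round() rounds halves to the nearest even number, so 0.5 becomes 0 and 2.5 becomes 2.

```python
import numpy as np
import pandas as pd

s = pd.Series([0.5, 1.5, 2.5, np.nan])

# Option A: replace missing values with 0, then round and convert.
filled = s.fillna(0.0).round().astype("Int64")

# Option B: skip fillna and let the nullable Int64 type keep the gap as <NA>.
nullable = s.round().astype("Int64")

print(filled.tolist())        # [0, 2, 2, 0]
print(nullable.isna().sum())  # 1
```

Option A changes the data (a missing value becomes a real 0), while Option B keeps the fact that the value was missing. Which one is right depends on what a gap means in your dataset.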

Bulk conversion: The mighty blow

When the DataFrame grows, converting column by column gets tedious. Use select_dtypes(include=['float64']) to pick out the float columns, then convert them all with a single vectorized astype() call. Avoid applymap(np.int64) here: it calls a Python function on every cell, which is slower and no more precise:

# Gather the float squad
float_cols = df.select_dtypes(include=['float64']).columns
# Train the float squad to be ints
df[float_cols] = df[float_cols].astype('int64')
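A self-contained sketch of the bulk conversion on a mixed DataFrame, assuming the float columns hold whole numbers and no NaN (the column names are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "price": [1.0, 2.0, 3.0],
    "qty": [4.0, 5.0, 6.0],
    "name": ["a", "b", "c"],  # non-numeric column, left untouched
})

# Select only the float64 columns, then convert them in one vectorized call.
float_cols = df.select_dtypes(include=["float64"]).columns
df[float_cols] = df[float_cols].astype("int64")

print(df.dtypes)
```

Because only the selected columns are reassigned, the string column keeps its type and contents.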

Change display: The master of disguise

But what if you want to keep the float data but display it as integers to the human eye? Set pd.options.display.float_format, which changes only how values are printed, not what is stored:

# "I'm not a float, I swear!" 🕵️‍♂️
pd.options.display.float_format = '{:,.0f}'.format
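A quick sketch showing that the disguise is only skin deep. It uses pd.option_context so the setting applies inside the with block only, which is a choice made for this example rather than something the global assignment requires:

```python
import pandas as pd

df = pd.DataFrame({"col": [1234.56, 2.4]})

# Apply the display format temporarily and render the frame as text.
with pd.option_context("display.float_format", "{:,.0f}".format):
    shown = df.to_string()

print(shown)            # values are printed without decimals
print(df["col"].dtype)  # float64: the stored data is unchanged
```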

Controlling the integer type: The control freak

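Sometimes your data needs a specific integer type because of size or sign constraints. For this, pass a NumPy type such as np.int8, np.int16, np.int32, or np.int64 (or an unsigned one such as np.uint8) to astype(). Check first that every value fits in the chosen type, because out-of-range values are not preserved by the cast: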

# "I don't want just any integer. I want YOU, int32!" 😍
df['col'] = df['col'].astype(np.int32)
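One way to check the fit before picking a type is to compare the column's range against np.iinfo. The helper name fits_in below is hypothetical, written for this sketch only:

```python
import numpy as np
import pandas as pd

def fits_in(series: pd.Series, dtype) -> bool:
    """Return True if every value lies inside the integer dtype's range."""
    info = np.iinfo(dtype)
    return bool(series.min() >= info.min and series.max() <= info.max)

s = pd.Series([10.0, 200.0, 30000.0])

print(fits_in(s, np.int8))   # False: int8 tops out at 127
print(fits_in(s, np.int16))  # True: int16 tops out at 32767
print(fits_in(s, np.int32))  # True
```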

Handle with care: Data integrity

Converting floats to ints can lose information: the fractional part is discarded, and values too large for the target type do not survive the cast. It's akin to losing your luggage at the airport, not so fun! Check your data before converting.
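A simple pre-flight check, sketched on invented values, is to list the entries that would lose a fractional part:

```python
import pandas as pd

s = pd.Series([1.0, 2.5, 3.0, -4.75])

# A value has a fractional part when its remainder modulo 1 is non-zero.
has_fraction = s % 1 != 0
lossy_values = s[has_fraction].tolist()

print(lossy_values)  # [2.5, -4.75]
```

If the list is empty, the conversion will not discard any decimals.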

Data import: Type specification

When importing data, you can set the type up front by passing a dtype mapping such as dtype={'col': 'Int64'} to read_csv. It’s like labeling your luggage: you know what you packed!

# "Welcome aboard, Int64 travelers!" 🛫
df = pd.read_csv('file.csv', dtype={'col': 'Int64'})
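To try this without a file on disk, the sketch below reads from an in-memory string instead of 'file.csv'. The CSV content is made up; the empty field in the second row becomes <NA>:

```python
import io
import pandas as pd

csv_text = "col,other\n1,a\n,b\n3,c\n"

# Read the column straight into the nullable integer type.
df = pd.read_csv(io.StringIO(csv_text), dtype={"col": "Int64"})

print(df["col"].dtype)         # Int64
print(df["col"].isna().sum())  # 1
```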

Post-conversion check: Count the luggage

Receipt check! Did you get all your luggage intact? Use df.dtypes to check:

# "Yes, my data is all here!" – Relieved Data Scientist 🧑‍🔬
print(df.dtypes)
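If you would rather have the check fail loudly than read the printout, one possible sketch uses pd.api.types.is_integer_dtype on each column (the sample frame is invented):

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]}).astype("int64")

# True only if every column ended up with an integer dtype.
all_int = all(pd.api.types.is_integer_dtype(t) for t in df.dtypes)
print(all_int)  # True
```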

Best performance: Master conversion

Prefer vectorized operations such as round() and astype() for performance. applymap() is not vectorized: it runs a Python function once per cell, and it has been deprecated since pandas 2.1 in favor of DataFrame.map(). Don’t forget to assign the converted columns back to the DataFrame:

# "Vectorize, don't iterate!" – Formerly Confused Data Scientist
df[float_cols] = df[float_cols].round().astype('int64')
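As a sanity check, the sketch below compares a per-cell int(round(x)) with a vectorized round().astype() chain on a small invented column. Both round halves to the nearest even number, so they agree here; the vectorized form does the work in one pass over the column rather than one Python call per value.

```python
import pandas as pd

df = pd.DataFrame({"x": [0.4, 1.5, 2.5, -3.7]})

# Vectorized: round the whole column, then cast it.
vectorized = df["x"].round().astype("int64")

# Element-wise: call a Python function for every value.
elementwise = df["x"].map(lambda v: int(round(v)))

print(vectorized.tolist())  # [0, 2, 2, -4]
```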
