
How do I find numeric columns in Pandas?

by Anton Shumikhin · Jan 4, 2025
TLDR

Find numeric columns with the built-in DataFrame method select_dtypes:

numeric_cols = df.select_dtypes(include='number').columns

The result is an Index holding the names of all numeric columns (integer, float, and so on). To narrow the search, pass a specific dtype such as 'float64' or 'int64':

float_cols = df.select_dtypes(include='float64').columns
int_cols = df.select_dtypes(include='int64').columns
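To see what these calls return, here is a minimal sketch on a made-up DataFrame. The column names and values are purely illustrative assumptions, not part of any real dataset:

```python
import pandas as pd

# Hypothetical sample data: one integer, one float, and one text column
df = pd.DataFrame({
    "age": [25, 32, 47],
    "height": [1.75, 1.62, 1.80],
    "name": ["Ann", "Bob", "Cid"],
})

numeric_cols = df.select_dtypes(include="number").columns
float_cols = df.select_dtypes(include="float64").columns
int_cols = df.select_dtypes(include="int64").columns

print(list(numeric_cols))  # ['age', 'height']
print(list(float_cols))    # ['height']
print(list(int_cols))      # ['age']
```

The text column is left out of all three results, because select_dtypes goes by each column's dtype rather than by what the values look like.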

Whether you're dealing with a lightweight dataset or a massive data tsunami, select_dtypes can be your lifeguard: it decides by column dtype rather than by scanning individual values.

Hunting for golden numbers

The select_dtypes method brings us pure gold by retrieving numeric columns from a sea of data. Its include and exclude parameters accept a single dtype or a list of dtypes, so you can separate the wheat (or in our case, numbers) from the chaff in either direction.

# Indiana Jones style: Including treasure from multiple classes
numeric_bool_cols = df.select_dtypes(include=['number','bool']).columns
# Hercule Poirot style: Eliminate the numeric suspects and keep everything else
non_numeric_cols = df.select_dtypes(exclude='number').columns
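A small sketch of how include and exclude behave, assuming a toy DataFrame with one float, one boolean, and one text column (all names are hypothetical). It also shows why 'bool' has to be listed explicitly: boolean columns are not matched by 'number'.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "price": [9.99, 14.5],
    "in_stock": [True, False],
    "label": ["a", "b"],
})

only_numbers = df.select_dtypes(include="number").columns
numbers_and_bools = df.select_dtypes(include=["number", "bool"]).columns
non_numeric = df.select_dtypes(exclude="number").columns

print(list(only_numbers))       # ['price']
print(list(numbers_and_bools))  # ['price', 'in_stock']
print(list(non_numeric))        # ['in_stock', 'label']
```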

Guard against sneak attacks

While handling data, unintended type casting might sneak in and rob you of your numeric columns. To prevent this, check your prized possessions:

import pandas as pd

# Putting an end to the identity thief’s shenanigans
assert all(pd.api.types.is_numeric_dtype(df[col]) for col in numeric_cols)

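The assertion fails as soon as any column listed in numeric_cols no longer has a numeric dtype, so you find out right after the processing step that changed it rather than much later in the analysis.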
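Here is one way such a sneak attack might look in practice. This is a sketch under assumptions: the DataFrame, the "n/a" placeholder, and the repair with pd.to_numeric are illustrative choices, not the only way to handle the problem.

```python
import pandas as pd

# Hypothetical data: both columns start out numeric
df = pd.DataFrame({"qty": [1, 2, 3], "price": [9.5, 7.25, 3.0]})
numeric_cols = df.select_dtypes(include="number").columns

# Simulated processing step: a placeholder string sneaks into "price",
# which turns the whole column into object dtype
df["price"] = ["n/a", 7.25, 3.0]

# Check each column that was numeric before the processing step
still_numeric = {col: pd.api.types.is_numeric_dtype(df[col]) for col in numeric_cols}
print(still_numeric)  # {'qty': True, 'price': False}

# One possible repair: coerce unparseable entries to NaN
df["price"] = pd.to_numeric(df["price"], errors="coerce")
repaired = pd.api.types.is_numeric_dtype(df["price"])
```

Note that errors="coerce" replaces the bad entry with NaN, so it is worth counting missing values afterwards.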

The Unsung Hero: _get_numeric_data()

Although select_dtypes is the star of the show, there is also an unsung hero: the _get_numeric_data() method. It can be faster in some cases, but it's generally safer to stick with the star rather than the understudy.

# Shhh! Secret treasure map hidden among the scripts
all_numeric_df = df._get_numeric_data()

Remember, the leading underscore marks this method as private. It is undocumented, so it can be volatile and may change or disappear in future versions of Pandas.
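For a quick side-by-side, the sketch below runs both approaches on a hypothetical three-column DataFrame. Because _get_numeric_data() is private, treat its half of the example as an assumption about current Pandas versions rather than a guarantee.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5], "c": ["x", "y"]})

# Public, documented route
public_result = df.select_dtypes(include="number")

# Private route: may change or disappear without notice
private_result = df._get_numeric_data()

print(list(public_result.columns))
print(list(private_result.columns))
```

Unlike select_dtypes, which returns a DataFrame you then ask for .columns, this comparison keeps both results as DataFrames so you can inspect the data as well as the names.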

Dive deeper into the sea of data types

Specialist searching

Pandas offers a wide array of numeric types. If you are a perfectionist looking for a specific kind of gem, you can find it by listing the exact dtypes you want:

# Detective on duty
optimized_cols = df.select_dtypes(include=['int8','int16','float16']).columns
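The sketch below illustrates that an exact dtype only matches columns of exactly that width. The DataFrame is hypothetical, and passing NumPy's abstract np.integer class to catch every integer width is shown as one option, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with deliberately different widths
df = pd.DataFrame({
    "small": np.array([1, 2], dtype="int8"),
    "medium": np.array([100, 200], dtype="int16"),
    "half": np.array([0.5, 1.5], dtype="float16"),
    "big": np.array([10, 20], dtype="int64"),
})

# Exact matches only: the int64 column is not selected
optimized_cols = df.select_dtypes(include=["int8", "int16", "float16"]).columns
print(list(optimized_cols))  # ['small', 'medium', 'half']

# The abstract np.integer class matches integer columns of any width
all_int_cols = df.select_dtypes(include=[np.integer]).columns
print(list(all_int_cols))    # ['small', 'medium', 'big']
```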

Explore beyond the regular treasure

What if you want to fetch datetime or timedelta columns along with the numeric ones? No worries, we have the perfect ghost ship for our treasure hunt:

numer_time_cols = df.select_dtypes(include=['number','datetime','timedelta']).columns
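As a rough illustration, the following sketch builds a hypothetical DataFrame with one numeric, one datetime, one timedelta, and one text column, then applies the same include list:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "amount": [10.0, 20.0],
    "created": pd.to_datetime(["2025-01-04", "2025-01-05"]),
    "duration": pd.to_timedelta(["1 days", "2 days"]),
    "note": ["first", "second"],
})

numer_time_cols = df.select_dtypes(include=["number", "datetime", "timedelta"]).columns
print(list(numer_time_cols))  # ['amount', 'created', 'duration']
```

The datetime column here is timezone-naive. Timezone-aware columns have their own dtype, which the select_dtypes documentation selects with 'datetimetz'.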

Beware of Imposters!

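Be cautious with mixed-type columns. From the outside they look like ordinary columns, but inside they hold both numeric and non-numeric values. Pandas stores them with the object dtype, so select_dtypes(include='number') skips them, which can quietly hijack your analysis: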

# Launch the "Imposter" detection mechanism: object columns holding more than one Python type
obj_cols = df.select_dtypes(include='object').columns
suspect_cols = [c for c in obj_cols if df[c].dropna().map(type).nunique() > 1]  # Spot potential "Among Us" imposters
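To make the imposter hunt concrete, here is a minimal sketch. The helper name mixed_type_columns and the sample data are hypothetical, and counting distinct Python types per object column is just one heuristic for spotting mixed content.

```python
import pandas as pd

# Hypothetical sample data: "mixed" holds an int, a str, and a float
df = pd.DataFrame({
    "clean": [1.0, 2.0, 3.0],
    "mixed": [1, "two", 3.0],
    "text": ["a", "b", "c"],
})

def mixed_type_columns(frame):
    """Return names of object columns that contain more than one Python type."""
    suspects = []
    for col in frame.select_dtypes(include="object"):
        distinct_types = frame[col].dropna().map(type).nunique()
        if distinct_types > 1:
            suspects.append(col)
    return suspects

suspect_cols = mixed_type_columns(df)
print(suspect_cols)  # ['mixed']

# Count how many entries in the suspect column cannot be read as numbers
unparseable = int(pd.to_numeric(df["mixed"], errors="coerce").isna().sum())
```

A column with only a few unparseable entries is often a numeric column with stray placeholders, while a column where most entries fail is probably genuine text.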