
How to iterate over columns of a pandas dataframe

python
dataframe
performance
vectorized-operations
by Nikita Barsukov · Jan 26, 2025
TLDR

To iterate over pandas DataFrame columns, use the df.items() method (df.iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0):

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

for label, content in df.items():
    print(f'{label}:')
    print(content)

Here, each iteration gives you the column label and the column's values as a Series.

Turning knobs: Performance and flexibility with df.items() and df.apply()

Looping with df.items() is fine, but df.apply() can offer better performance and more flexibility, since it avoids an explicit Python-level loop. So, let's turn some knobs:

for label, content in df.items():
    # label and content at your service. Got speed? Let's roll!
    print(f'{label}: sum={content.sum()}')

Or even better, apply a function on each column without explicitly iterating:

result = df.apply(some_cool_function)  # Just applied some function on all columns. It feels like magic!
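As a minimal sketch (some_cool_function above is only a placeholder for any function that accepts a Series), here is apply computing each column's range:

col_range = df.apply(lambda col: col.max() - col.min())  # one number per column, no loop in sight
print(col_range)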

Working with regression? No problem, use df.apply() to get residuals. Or spin a wheel with traditional loops:

import statsmodels.api as sm

# target is your dependent-variable Series, defined elsewhere
residuals = df.apply(lambda col: sm.OLS(target, col).fit().resid)  # Running regressions faster than Usain Bolt!

# Or old school:
for column in df.columns:
    model = sm.OLS(target, df[column])
    results = model.fit()
    df['residual_' + column] = results.resid  # Residuals stored. Who's got time for residuals inspection?

Slicing more your style? Use df.columns:

sub_df = df[df.columns[1:3]]  # Slicing dataframe like a hot knife through butter!

Dislike disorganized column manipulation? enumerate to the rescue!

for i, column in enumerate(df.columns):
    # Oh, I see, you like things organized! OCD much?
    print(i, column)  # e.g. play_with(i, column)

And yes, we refrain from the 'ix' indexer, which was deprecated and then removed in pandas 1.0. We're fans of .loc and .iloc.

df.loc[:, 'A']   # label-based: all rows, column 'A'
df.iloc[0, :]    # position-based: first row, all columns
# 'ix' who? Never heard of him!

Need to treat columns like rows? df.transpose() is all you need.

for row_label, row_value in df.transpose().iterrows():
    # row_label is a column in disguise!
    print(row_label, row_value.tolist())

Of course, add error checks because we are not savages!
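A minimal sketch of that idea, skipping non-numeric columns and guarding the per-column work (the print is just a stand-in for whatever you actually do):

for label, content in df.items():
    if not pd.api.types.is_numeric_dtype(content):
        continue  # skip text or mixed-type columns instead of blowing up
    try:
        print(f'{label}: mean={content.mean():.2f}')
    except Exception as exc:  # in real code, catch the specific exceptions you expect
        print(f'Skipping {label}: {exc}')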

Techniques for the initiated

Working with large dataframes

With large dataframes, watch the memory footprint:

  • Generator expressions can help with memory efficiency.
  • Vectorized operations are your friends (see the sketch below)!
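A minimal sketch of both ideas, using a made-up large frame big_df (the generator expression consumes one column at a time; the vectorized call lets pandas do the looping in C):

import numpy as np
import pandas as pd

big_df = pd.DataFrame(np.random.rand(1_000_000, 4), columns=list('ABCD'))  # stand-in for your large frame

# Generator expression: columns are processed one at a time, nothing extra is materialized
total_of_means = sum(content.mean() for _, content in big_df.items())

# Vectorized: one call, no Python-level loop over columns
column_means = big_df.mean()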

The art of slicing

To improve your slicing game with df.columns:

  • Negative indexing to skip the last column(s)
  • Conditional slicing to filter columns (see the sketch below)
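A minimal sketch of both tricks (the 'residual_' prefix is just an example condition):

all_but_last = df[df.columns[:-1]]  # negative indexing: everything except the last column
residual_cols = df[[col for col in df.columns if col.startswith('residual_')]]  # conditional slicing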

The need for speed in regression

Running regressions? Here's how to win the race:

  • Pre-allocation to store regression results
  • Multiprocessing for parallel computation (sketched below)
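A minimal sketch of both, assuming statsmodels is installed, df holds the regressors, and target is your dependent-variable Series; the parallel variant is left commented because worker functions must be importable (top-level) to be picklable:

import pandas as pd
import statsmodels.api as sm

def fit_column(args):
    """Fit a one-regressor OLS model and return (column name, residuals)."""
    name, regressor, target = args
    return name, sm.OLS(target, regressor).fit().resid

# Pre-allocation: build the result frame once instead of growing df column by column
residuals = pd.DataFrame(index=df.index, columns=df.columns, dtype=float)

for name in df.columns:
    residuals[name] = fit_column((name, df[name], target))[1]

# Multiprocessing variant (sketch):
# from concurrent.futures import ProcessPoolExecutor
# jobs = [(name, df[name], target) for name in df.columns]
# with ProcessPoolExecutor() as pool:
#     for name, resid in pool.map(fit_column, jobs):
#         residuals[name] = resid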