
Writing a pandas DataFrame to CSV file

by Anton Shumikhin · Jan 5, 2025
TLDR

DataFrame.to_csv writes a pandas DataFrame to a CSV file:

df.to_csv('filename.csv', index=False)

This saves df as filename.csv. Passing index=False leaves out the row index, which would otherwise be written as an extra first column.

For tab-separated values, use sep='\t':

df.to_csv('filename.tsv', sep='\t', index=False)

UTF-8 is already the default encoding for to_csv, but you can state it explicitly when the data contains non-ASCII characters:

df.to_csv('filename.csv', encoding='utf-8', index=False)
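To see what these options actually write, here is a small sketch. It relies on the fact that to_csv returns the CSV text as a string when no path is given. The sample DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'city': ['Oslo', 'Lima'], 'temp': [3, 19]})

# With no path argument, to_csv returns the CSV text instead of writing a file
with_index = df.to_csv()                           # row index becomes the first column
without_index = df.to_csv(index=False)             # only the data columns
tab_separated = df.to_csv(sep='\t', index=False)   # tabs instead of commas

print(with_index)
print(without_index)
print(tab_separated)
```

Comparing the first two printouts shows the extra unnamed leading column that index=False removes.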

File paths: A love story

Before exporting a DataFrame, double-check your file path: a relative path is resolved against the current working directory. Unsure what your working directory is? Let's clear the fog:

import os

current_directory = os.getcwd()  # returns the path of your current 'pythonic' dwelling
print("Working directory:", current_directory)

When specifying Windows file paths, remember that the backslash is Python's escape character: either use a raw string (r'path\to\file.csv') or double each backslash ('path\\to\\file.csv').
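One way to sidestep backslash escaping altogether is to build the path with pathlib, which to_csv accepts directly. The sketch below is only an illustration: the 'exports' folder name and the sample data are assumptions, and it writes into a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})

with tempfile.TemporaryDirectory() as tmp:
    # Join path parts with "/" so no backslashes need escaping on any OS
    out_path = Path(tmp) / 'exports' / 'filename.csv'

    # to_csv does not create missing folders, so create the parent first
    out_path.parent.mkdir(parents=True, exist_ok=True)

    df.to_csv(out_path, index=False)

    file_was_written = out_path.exists()
    written_text = out_path.read_text(encoding='utf-8')

print("File written:", file_was_written)
print(written_text)
```

In real use you would replace the temporary directory with the folder where the file should live.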

Smoothing out encoding wrinkles

A basic to_csv call works well most of the time, but some data needs special treatment. For instance:

  • A DataFrame whose string values contain characters that cannot be encoded may need cleaning column by column. In Python, there's always a for loop for that:
for column in df.columns:
    # Re-encode every string value, dropping characters UTF-8 cannot represent;
    # non-string values (numbers, NaN, dates) are left untouched
    df[column] = df[column].apply(
        lambda x: x.encode('utf-8', 'ignore').decode('utf-8') if isinstance(x, str) else x
    )
df.to_csv('filename.csv', index=False)  # proceed to export peacefully
  • Sometimes, encoding exceptions still occur. A well-placed try-except lets you fall back gracefully:
try:
    df.to_csv('filename.csv', index=False)  # standard procedure
except UnicodeEncodeError as e:  # unforeseen circumstances
    print("Error during encoding:", e)
    # In case of fire, break glass: drop the characters that cannot be encoded
    # (the errors parameter requires pandas 1.1 or later)
    df.to_csv('filename.csv', encoding='utf-8', errors='ignore', index=False)
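A quick way to check that an export kept its non-ASCII text is to read the file back and compare. This is a minimal sketch under assumed sample data. It writes to a temporary directory and uses pd.read_csv, which the text above does not cover, purely to verify the round trip.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data containing non-ASCII characters
df = pd.DataFrame({'name': ['Zoë', 'José', '東京'], 'score': [1, 2, 3]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'filename.csv')

    # Write and read back with the same explicit encoding
    df.to_csv(path, encoding='utf-8', index=False)
    restored = pd.read_csv(path, encoding='utf-8')

# The names should come back unchanged
print(restored['name'].tolist())
```

If the reading side uses a different encoding than the writing side, this comparison is where the mismatch would show up.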

Customizing the CSV export

Need more control over the CSV output? pandas' to_csv offers a cornucopia of options:

Taming quotes with quoting

from csv import QUOTE_MINIMAL

# QUOTE_MINIMAL (the default) quotes only fields that contain the separator, a quote character, or a line break
df.to_csv('filename.csv', quoting=QUOTE_MINIMAL, index=False)
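To make the effect of quoting visible, the sketch below compares QUOTE_MINIMAL with csv.QUOTE_ALL, another constant from the standard csv module that the text above does not mention. The sample data is made up; one value deliberately contains a comma.

```python
from csv import QUOTE_ALL, QUOTE_MINIMAL

import pandas as pd

# Hypothetical sample data; 'Bob, Jr.' contains the separator character
df = pd.DataFrame({'name': ['Ann', 'Bob, Jr.'], 'n': [1, 2]})

# With no path argument, to_csv returns the CSV text as a string
minimal_lines = df.to_csv(quoting=QUOTE_MINIMAL, index=False).splitlines()
all_lines = df.to_csv(quoting=QUOTE_ALL, index=False).splitlines()

print(minimal_lines)  # only the field containing a comma is quoted
print(all_lines)      # every field, including the header, is quoted
```

QUOTE_MINIMAL keeps files compact, while QUOTE_ALL can help when a downstream parser expects every field to be quoted.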

Playing hide and seek with the DataFrame columns

df.to_csv('filename.csv', columns=['Column1', 'Column2'], index=False)  # only Column1 and Column2 are written; the rest stay hidden
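As a small illustration, the columns argument selects which columns are written and also sets their order in the output. The DataFrame and its column names here are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with three columns
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

# Write only 'c' and 'a', in that order; 'b' is left out
# (no path argument, so the CSV text is returned as a string)
selected_lines = df.to_csv(columns=['c', 'a'], index=False).splitlines()

print(selected_lines)
```

The DataFrame itself is not modified; only the exported text is restricted to the listed columns.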

Munching large DataFrames in manageable bites

chunk_size = 1000
for i in range(0, len(df), chunk_size):
    # Each slice of up to chunk_size rows goes to its own file: filename_0.csv, filename_1000.csv, ...
    df.iloc[i:i+chunk_size].to_csv(f'filename_{i}.csv', index=False)  # Cookie Monster loves chunks!
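If you would rather end up with a single file, one possible variation is to append each chunk to the same path and write the header only once. This is a sketch rather than the approach shown above: the sample data is made up, it writes to a temporary directory, and it reads the result back with pd.read_csv only to confirm that nothing was lost.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: 2500 rows, so the last chunk is partial
df = pd.DataFrame({'id': range(2500), 'value': [i * 2 for i in range(2500)]})
chunk_size = 1000

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'filename.csv')

    for start in range(0, len(df), chunk_size):
        first_chunk = (start == 0)
        df.iloc[start:start + chunk_size].to_csv(
            path,
            mode='w' if first_chunk else 'a',  # overwrite once, then append
            header=first_chunk,                # write column names only once
            index=False,
        )

    restored = pd.read_csv(path)

print(len(restored), "rows written")
```

The header flag matters here: without it, every appended chunk would repeat the column names in the middle of the file.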

The flexibility of to_csv lets you fine-tune CSV output to suit your dataset and use case.