Convert Python dict into a dataframe

by Nikita Barsukov · Sep 11, 2024
TLDR

Easily convert a Python dict into a DataFrame with pd.DataFrame.from_dict(data). Each key becomes a column, and each value, assuming it is list-like, supplies that column's entries, one per row.

import pandas as pd

# Example dictionary with lists as values
data = {'A': [1, 2], 'B': [3, 4]}

# Use pandas magic
df = pd.DataFrame.from_dict(data)

The result is a 2x2 DataFrame with columns 'A' and 'B'. Easy, right?

Detailed walkthrough

Start with a simple dictionary

For dictionaries with list-like values, use pd.DataFrame.from_dict() as shown above.
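To see what that call produces, here is a minimal sketch that prints the result and compares it with the plain pd.DataFrame(data) constructor, which accepts the same kind of dict. The variable names df_from_dict, df_constructor and same are made up for this illustration.

```python
import pandas as pd

# Same example dictionary as in the TLDR
data = {'A': [1, 2], 'B': [3, 4]}

# Two ways to build the same DataFrame from a dict of lists
df_from_dict = pd.DataFrame.from_dict(data)
df_constructor = pd.DataFrame(data)

print(df_from_dict)
#    A  B
# 0  1  3
# 1  2  4

# Both routes give identical frames for this input
same = df_from_dict.equals(df_constructor)
```

For a dict of equal-length lists the two calls are interchangeable; from_dict mainly earns its keep once you need the orient parameter.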

Time is of the essence

If your dictionary keys are dates and the values are single scalars, build a Series first so the dates land in the index, then name that index and turn it into a regular column. Here's how:

data = {'2021-01-01': 10, '2021-01-02': 15}

# Convert to Series because it's cooler this way
s = pd.Series(data)

# Name that index right
s.index.name = 'Date'

# Convert to DataFrame and reset index (because why not?)
df = s.reset_index(name='DateValue')

Now df has two columns, Date and DateValue, with one row per date.
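If you would rather keep the dates as the index instead of moving them into a column, one possible variation looks like this. It is a sketch, not the only way to do it: it parses the string keys with pd.to_datetime and uses Series.to_frame() in place of reset_index.

```python
import pandas as pd

data = {'2021-01-01': 10, '2021-01-02': 15}

# Keys become the index, values become the data
s = pd.Series(data, name='DateValue')

# Parse the string keys into real timestamps and name the index
s.index = pd.to_datetime(s.index)
s.index.name = 'Date'

# One-column DataFrame that keeps the dates as a DatetimeIndex
df = s.to_frame()
```

A DatetimeIndex lets you slice by date and resample later on, which is usually why you want dates in the index in the first place.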

How about "date"-ing the right way?

If your dates are grumpy and don't want to follow datetime format, pd.to_datetime is your friend:

df['Date'] = pd.to_datetime(df['Date'])

This transforms your Date column into Timestamp objects. They're not only cool, but also useful for time series analysis.
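When some of the strings really are grumpy, pd.to_datetime raises an error by default. A hedged sketch of one way around it: pass errors='coerce' so unparseable entries become NaT instead. The sample rows, including the 'not a date' entry, are invented for illustration.

```python
import pandas as pd

# Hypothetical data with one string that is not a valid date
df = pd.DataFrame({
    'Date': ['2021-01-01', '2021-01-02', 'not a date'],
    'DateValue': [10, 15, 20],
})

# Parse with an explicit format; anything that does not match becomes NaT
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d', errors='coerce')

# Boolean mask of the rows that failed to parse
bad_rows = df['Date'].isna()
```

You can then inspect or drop the rows flagged in bad_rows rather than letting one bad value stop the whole conversion.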

Look, a single-value dictionary!

When every key-value pair is feeling lonesome (read: single scalar value), pandas cannot work out an index on its own, so reshape the data first:

data = {'A': 1, 'B': 2}

# One row per key, in 'Key' and 'Value' columns
df = pd.DataFrame(list(data.items()), columns=['Key', 'Value'])

# Or a single row, with the keys as column headers
df = pd.DataFrame([data])

The first snippet gives you one row per key, split into Key and Value columns, while the second produces a single row with the keys as column headers.
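For completeness, here is a small sketch of what happens when you skip the reshaping, and of the third option: passing an explicit index. The raised flag is just a helper name for this example.

```python
import pandas as pd

data = {'A': 1, 'B': 2}

# Passing only scalars with no index makes pandas raise a ValueError
try:
    pd.DataFrame(data)
    raised = False
except ValueError:
    raised = True

# Supplying an index tells pandas how many rows to build
df = pd.DataFrame(data, index=[0])
```

The result is the same single-row frame as pd.DataFrame([data]), with 'A' and 'B' as columns.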

No space? No problem!

If you're dealing with chunky dictionaries or you're just trying to save disk space, consider writing the resulting DataFrame to a columnar file format like Parquet or Feather:

df.to_parquet('data.parquet')
df.to_feather('data.feather')

You can read these back into a DataFrame with pd.read_parquet() and pd.read_feather() respectively. Remember, tidy data is happy data.
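A round trip might look like the sketch below. It assumes a Parquet engine such as pyarrow or fastparquet is installed, since pandas relies on one for to_parquet and read_parquet; without it pandas raises an ImportError, which the example catches. The temporary directory is only there to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.parquet')
    try:
        # Write to disk, then load it straight back
        df.to_parquet(path)
        restored = pd.read_parquet(path)
    except ImportError:
        # No Parquet engine available in this environment
        restored = None
```

If an engine is available, restored holds the same columns and values as df.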

Other interesting scenarios

Working with JSON

Creating a DataFrame from a list of dictionaries is as easy as pie. It's especially useful when your data is in JSON format:

data = [{'A': 1, 'B': 2}, {'A': 3, 'B': 4}]

# Who knew it was this easy?
df = pd.DataFrame(data)

In this case, a single list item corresponds to a row in the DataFrame. Neat, huh?
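Starting from an actual JSON string, the same idea could be sketched like this. The raw string is invented for illustration, and its second record deliberately omits 'B' to show that pandas fills missing keys with NaN.

```python
import json

import pandas as pd

# Hypothetical JSON payload: a list of records, the second one missing "B"
raw = '[{"A": 1, "B": 2}, {"A": 3}]'

# json.loads turns the string into a list of dicts
records = json.loads(raw)

# Each dict becomes one row; missing keys become NaN
df = pd.DataFrame(records)
```

Because of the NaN, column 'B' ends up with a float dtype even though the JSON held integers.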

Mind your orientation

Some dictionaries are not like the others. This is where the orient parameter comes in handy:

# Example dict, made up for illustration
d = {'one': [1., 2., 3.], 'two': [4., 5., 6.]}

# orient='index' turns keys into index, and voila, rows of values!
df_index_oriented = pd.DataFrame.from_dict(d, orient='index')

# orient='columns' (the default) does the opposite: keys become column labels.
df_columns_oriented = pd.DataFrame.from_dict(d, orient='columns')

When you find your dictionary dancing to its beat, orient is your dance partner.
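To make the difference concrete, the sketch below builds both orientations from the same dict and labels the index-oriented columns with the columns argument, which from_dict accepts when orient='index'. The labels 'x', 'y', 'z' and the names by_index and by_columns are made up for this example.

```python
import pandas as pd

d = {'one': [1., 2., 3.], 'two': [4., 5., 6.]}

# Keys become row labels; name the three value positions explicitly
by_index = pd.DataFrame.from_dict(d, orient='index', columns=['x', 'y', 'z'])

# Keys become column labels; each list runs down its column
by_columns = pd.DataFrame.from_dict(d, orient='columns')

print(by_index.shape)    # (2, 3)
print(by_columns.shape)  # (3, 2)
```

One frame is the transpose of the other, so pick the orientation that matches how each dict entry should read: as a row or as a column.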