
Combining two Series into a DataFrame in pandas

By Nikita Barsukov · Feb 2, 2025
TLDR

In a rush? Here’s how you merge two pandas Series into a DataFrame. Either pop them into a dictionary keyed by column name, or duct tape them side by side with pd.concat and axis=1 (column-wise concatenation). Blink twice if you got it:

import pandas as pd

# You got two Series: series1 and series2

# Magical trick #1: DataFrame via Dictionary
df = pd.DataFrame({'column1': series1, 'column2': series2})

# Magical trick #2: DataFrame via pd.concat
df = pd.concat([series1, series2], axis=1, keys=['column1', 'column2'])

Ta-da! DataFrame magic. 'series1' becomes 'column1' and 'series2' morphs into 'column2'.
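If you want to run the two tricks end to end, here is a minimal sketch. The sample values and the 'a', 'b', 'c' index labels are made up for illustration; only pd.DataFrame and pd.concat come from the recipe above.

```python
import pandas as pd

# Hypothetical sample data: two Series sharing the same index labels
series1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
series2 = pd.Series([4, 5, 6], index=["a", "b", "c"])

# Trick #1: dictionary keys become the column names
df_dict = pd.DataFrame({"column1": series1, "column2": series2})

# Trick #2: axis=1 places the Series side by side, keys name the columns
df_concat = pd.concat([series1, series2], axis=1, keys=["column1", "column2"])

print(df_dict)
print(df_concat)
```

With matching indexes, both routes should give you the same two-column table.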

Tactics of indices and column names

Diving into the risky business of combining Series into a DataFrame, safeguarding the original indices is a must: no losing data on our watch. This is where the nimble footwork of pd.concat comes in handy. It aligns the Series on their index labels, so non-consecutive indices survive intact, and any label missing from one Series is filled with NaN rather than dropped. The stake of the game is the axis parameter, which needs to be handled with care: axis=0 concatenates along the index (stacking the 'rows' into one long Series), whereas axis=1 places the Series side by side as columns.
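To make the axis distinction concrete, here is a small sketch. The Series names 'a' and 'b' and the index values 10, 20, 30 are illustrative assumptions, not part of the original recipe.

```python
import pandas as pd

# Hypothetical Series with a non-consecutive index
s1 = pd.Series([1, 2, 3], index=[10, 20, 30], name="a")
s2 = pd.Series([4, 5, 6], index=[10, 20, 30], name="b")

# axis=0 stacks the values end to end into one longer Series
stacked = pd.concat([s1, s2], axis=0)

# axis=1 lines the Series up as columns of a DataFrame
side_by_side = pd.concat([s1, s2], axis=1)

print(stacked.shape)       # (6,)
print(side_by_side.shape)  # (3, 2)
```

Note that the index labels 10, 20, 30 are kept in both results; with axis=0 they simply appear twice.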

What’s that, afraid of losing column names? Fear not, for the Series' name attributes have got you covered. They’ll fill in as column names in the DataFrame by default. If your Series' names draw a blank, you can assign column names with the keys parameter of pd.concat.

# No need for an identity crisis - Series names transform into DataFrame's column names
series1.name, series2.name = 'column1', 'column2'
df = pd.concat([series1, series2], axis=1)
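Here is a rough sketch of the three naming situations: named Series, unnamed Series, and keys. The variable names and the 'left'/'right' keys are hypothetical.

```python
import pandas as pd

named1 = pd.Series([1, 2], name="column1")
named2 = pd.Series([3, 4], name="column2")
plain1 = pd.Series([1, 2])  # no name attribute set
plain2 = pd.Series([3, 4])

# Series names become the column names
named = pd.concat([named1, named2], axis=1)

# Unnamed Series are numbered consecutively: 0, 1
unnamed = pd.concat([plain1, plain2], axis=1)

# keys supplies the column names explicitly
keyed = pd.concat([plain1, plain2], axis=1, keys=["left", "right"])

print(list(named.columns))
print(list(unnamed.columns))
print(list(keyed.columns))
```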

What's more, you can add more than two Series to this group dance. pd.concat can handle an entire conga line of Series. A word to the wise: if the Series' indexes don't match, pd.concat will fill in the missteps with NaNs.
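A quick sketch of the conga line with mismatched indexes. The three Series and their labels 'x', 'y', 'z' are invented for the example.

```python
import pandas as pd

# Hypothetical Series whose index labels only partly overlap
s1 = pd.Series([1.0, 2.0], index=["x", "y"], name="first")
s2 = pd.Series([3.0, 4.0], index=["y", "z"], name="second")
s3 = pd.Series([5.0], index=["x"], name="third")

# The result covers every label seen in any Series
df = pd.concat([s1, s2, s3], axis=1)

print(df)
print(df.isna().sum())  # NaN count per column
```

Each column holds NaN wherever its Series had no value for that label, so no row is silently lost.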

Index preservation and reincarnation

For an encore, to reincarnate the original index as an additional column, use reset_index as a magic wand right after the pd.concat show.

# The original index is reborn as a column
df = pd.concat([series1, series2], axis=1).reset_index()
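As a sketch of what the reborn column looks like: an unnamed index lands in a column called 'index'. If you would rather pick the name yourself, one option (an addition here, not part of the original recipe) is rename_axis before reset_index. The name 'label' is hypothetical.

```python
import pandas as pd

s1 = pd.Series([1, 2], index=["x", "y"], name="column1")
s2 = pd.Series([3, 4], index=["x", "y"], name="column2")

# Unnamed index -> new column is called "index"
df = pd.concat([s1, s2], axis=1).reset_index()

# Name the index first to control the new column's name
df_named = pd.concat([s1, s2], axis=1).rename_axis("label").reset_index()

print(list(df.columns))
print(list(df_named.columns))
```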

For pandas in flashback mode

For those living in a time bubble using pandas version below 0.23, kick start the engine by converting each Series to a DataFrame with to_frame(), then join them for a get-together.

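# Convert Series into DataFrames and let them join the party. Works with pandas v<0.23.
df1, df2 = series1.to_frame('column1'), series2.to_frame('column2')
df = df1.join(df2, how='outer')  # join defaults to how='left'; 'outer' keeps the party open to all!

Sound the bugles, though! With the default how='left', join keeps only the index labels of the first DataFrame, so labels that appear only in the second Series are dropped. Either make sure the Series are on the same page with matching indexes, or pass how='outer'.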
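To see how the how argument of join decides which index labels get into the party, here is a small sketch with made-up labels 'x', 'y', 'z'.

```python
import pandas as pd

# Hypothetical Series with partly overlapping index labels
series1 = pd.Series([1, 2], index=["x", "y"])
series2 = pd.Series([3, 4], index=["y", "z"])

df1 = series1.to_frame("column1")
df2 = series2.to_frame("column2")

# Default how='left': only the labels of df1 survive
left = df1.join(df2)

# how='outer': every label from either side is kept
outer = df1.join(df2, how="outer")

print(left)
print(outer)
```

In the left join, the label 'z' from series2 is gone; in the outer join it stays, with NaN in column1.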

Direct DataFrame birth with no concatenation business

For those who like their coffee black - eliminate concatenation and directly spawn a DataFrame using a dictionary of Series.

# Direct DataFrame creation from a family of Series
df = pd.DataFrame({'column1': series1, 'column2': series2})

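Under this approach, pandas aligns the values on the index labels of the Series, so rows line up by label rather than by position. A label that is missing from one Series shows up as NaN in that column. You've successfully dodged any data misalignment.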
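A minimal sketch of that alignment, using invented data where the second Series is both shorter and in a different order:

```python
import pandas as pd

series1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
series2 = pd.Series([30, 10], index=["c", "a"])  # different order, no "b"

df = pd.DataFrame({"column1": series1, "column2": series2})

print(df)
print(df.loc["a", "column2"])  # 10.0, matched by label, not by position
```

The value 10 ends up next to label 'a' even though it came second in series2, and 'b' gets NaN in column2.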

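Handling duplicates with non-consecutive indices

Put on your detective hat and check for potential duplicate suspects when combining Series with non-consecutive indices. Repeated index labels can silently sneak into your resulting DataFrame, and index.is_unique or index.duplicated() will expose them. In case the alarm bells ring over rows with identical values, unleash the guardian angel drop_duplicates to cleanse the DataFrame. Keep in mind that drop_duplicates compares the values in the rows, not the index labels.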
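Here is one way the detective work might look. The data is invented, and filtering with index.duplicated is just one option for repeated labels, not something the original recipe prescribes.

```python
import pandas as pd

# Hypothetical Series with a repeated index label (5 appears twice)
s1 = pd.Series([1, 2, 3], index=[0, 5, 5], name="column1")

print(s1.index.is_unique)            # False
print(s1.index.duplicated().tolist())  # [False, False, True]

# Keep only the first occurrence of each index label
deduped = s1[~s1.index.duplicated(keep="first")]

# drop_duplicates looks at row values, not at index labels
df = pd.DataFrame({"column1": [1, 1, 2], "column2": [3, 3, 4]}, index=[0, 5, 9])
clean = df.drop_duplicates()

print(deduped)
print(clean)
```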

Confirming the result

Fresh out of the oven, it’s always a good idea to check the first five rows of the DataFrame using df.head(). This simple litmus test confirms that the DataFrame is behaving as expected, with no element of surprise and no rebel rows disrupting the DataFrame structure.
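A short sketch of that taste test. The shape and NaN checks are extras added here alongside df.head(), and the sample data is made up.

```python
import pandas as pd

series1 = pd.Series(range(10), name="column1")
series2 = pd.Series(range(10, 20), name="column2")
df = pd.concat([series1, series2], axis=1)

print(df.head())        # first five rows
print(df.shape)         # (10, 2)
print(df.isna().sum())  # NaN count per column, a hint of misaligned indexes
```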

Handy tips for success

  • Are your columns having an identity crisis? Double-check your column name attribution when converting from Series.
  • For a name change, use .rename(columns={'old_name': 'new_name'}) to rename DataFrame columns on the fly.
  • Comprehend how pandas auto-aligns data when using a dictionary of Series to create a DataFrame.
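The rename tip from the list above can be sketched like this, with placeholder column names:

```python
import pandas as pd

series1 = pd.Series([1, 2])
series2 = pd.Series([3, 4])
df = pd.DataFrame({"old_name": series1, "other": series2})

# rename returns a new DataFrame; df itself is left unchanged
renamed = df.rename(columns={"old_name": "new_name"})

print(list(df.columns))
print(list(renamed.columns))
```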

Walking the plank: edge cases and pitfalls

  • Walk carefully over the plank of indexes with different lengths. pd.concat will fill the missing planks with NaN.
  • Watch out for unwanted duplicates gate-crashing due to non-unique indices.
  • Don't get lost in the labyrinth of hierarchical indexing (MultiIndex) if the Series have them. pd.concat will preserve them.
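For the MultiIndex case in the last bullet, a small sketch (the level names 'group' and 'item' are hypothetical) suggests what preservation looks like:

```python
import pandas as pd

# Hypothetical two-level index shared by both Series
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["group", "item"]
)
s1 = pd.Series([1, 2, 3], index=idx, name="column1")
s2 = pd.Series([4, 5, 6], index=idx, name="column2")

df = pd.concat([s1, s2], axis=1)

print(df)
print(df.index.names)  # the level names carry over to the DataFrame
```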