
How do I add an extra column to a NumPy array?

By Alex Kataev · Oct 24, 2024
TLDR

Join a 2D NumPy array with a fresh column in one fell swoop using numpy.column_stack:

    import numpy as np

    # Your array
    arr = np.array([[1, 2], [3, 4]])

    # Want to add a column?
    new_col = np.array([5, 6])

    # Presto!
    arr_expanded = np.column_stack((arr, new_col))

Got structured arrays with field names? Here's your solution: numpy.lib.recfunctions.append_fields.

    from numpy.lib import recfunctions as rfn

    # A nice structured array
    dtype = [('col1', int), ('col2', int)]
    arr = np.array([(1, 2), (3, 4)], dtype=dtype)

    # Boom, another column!
    arr_expanded = rfn.append_fields(arr, 'col3', [5, 6], usemask=False)

Enjoy your array-fitti: an array with the new column finely appended.

Painting scenarios: different strokes for different folks

Depending on your needs and your array's structure, the technique for appending an extra column can take a different hue. Let's paint a few popular scenarios:

Catering to numeric arrays

Need to append a column of zeros? Here is your happy little shortcut:

    # Voila! a column of zeroes
    arr_with_zeros = np.c_[arr, np.zeros(arr.shape[0])]

    # Alternatively, let's hstack for a more Picasso-esque approach
    arr_with_zeros = np.hstack((arr, np.zeros((arr.shape[0], 1))))

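These zero heroes don't have to match your array's data type, but a mismatch has a price: np.zeros produces float64 by default, so an integer array is upcast to float when the two are joined. Pass dtype=arr.dtype to np.zeros if you want to keep the original type.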
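To see the data type question on the canvas, here is a minimal sketch. It assumes a small integer array like the one in the TLDR; the names as_float and as_int are made up for illustration.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])  # integer array

# np.zeros defaults to float64, so the joined array is upcast to float
as_float = np.hstack((arr, np.zeros((arr.shape[0], 1))))

# Passing dtype=arr.dtype keeps the result in the original integer type
as_int = np.hstack((arr, np.zeros((arr.shape[0], 1), dtype=arr.dtype)))

print(as_float.dtype)             # float64
print(as_int.dtype == arr.dtype)  # True
```

If the float result is what you wanted anyway, the plain np.zeros call is fine as it is.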

Van Gogh'ing it for large datasets

Performance is your muse, and you yearn for the fastest method. Tools like perfplot are your palette to benchmark different ways like np.hstack, np.column_stack, or even direct assignment.

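    # How about some happy little arrays?
    # Preallocate with one extra column, then copy the original in;
    # the last column stays zero until you assign to it
    new_arr = np.zeros((arr.shape[0], arr.shape[1] + 1))
    new_arr[:, :-1] = arr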
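If you don't have perfplot at hand, a rough comparison is possible with the standard library's timeit. The sketch below is an assumption about how one might set it up, not a rigorous benchmark: the array size, the repeat count, and the helper names (with_hstack and friends) are all illustrative.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
arr = rng.random((10_000, 10))
new_col = rng.random(10_000)

def with_hstack():
    return np.hstack((arr, new_col.reshape(-1, 1)))

def with_column_stack():
    return np.column_stack((arr, new_col))

def with_assignment():
    # Preallocate, then fill the old columns and the new one
    out = np.empty((arr.shape[0], arr.shape[1] + 1), dtype=arr.dtype)
    out[:, :-1] = arr
    out[:, -1] = new_col
    return out

for func in (with_hstack, with_column_stack, with_assignment):
    seconds = timeit.timeit(func, number=100)
    print(f"{func.__name__}: {seconds:.4f} s for 100 runs")
```

Timings depend heavily on array size and hardware, so measure with shapes close to your real data before picking a winner.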

Preserving array contiguity: a still life

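Memory layout is your canvas, and sometimes you need it to stay C-contiguous. Stacking the columns as rows with np.vstack and then transposing can be faster, but it may leave your canvas splattered: the transpose is a view that is not C-contiguous. If contiguity matters, np.column_stack is a good choice, or you can restore the layout with np.ascontiguousarray.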
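Rather than guessing at the layout, you can inspect the flags attribute. The following is a small sketch that assumes the columns start out as separate 1D arrays; the names col_a, col_b, via_vstack, and fixed are illustrative.

```python
import numpy as np

col_a = np.array([1, 3])
col_b = np.array([2, 4])
new_col = np.array([5, 6])

# Stack the columns as rows, then transpose: the result is a view
# whose memory order no longer matches C (row-major) order
via_vstack = np.vstack((col_a, col_b, new_col)).T
print(via_vstack.flags['C_CONTIGUOUS'])  # False

# Stack as columns directly and check the layout instead of assuming it
via_column_stack = np.column_stack((col_a, col_b, new_col))
print(via_column_stack.flags['C_CONTIGUOUS'])

# np.ascontiguousarray copies into C order when it has to
fixed = np.ascontiguousarray(via_vstack)
print(fixed.flags['C_CONTIGUOUS'])       # True
```

Both routes hold the same values; only the memory layout differs, which matters mostly when you hand the array to code that expects row-major data.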

On adding a splash of multiple columns or unique types

If multi-column is your style

Handle np.concatenate like a paintbrush: gather your additional columns into one 2D array with the same number of rows as the original, then join the two along axis=1:

    # Wish granted, two new columns!
    extra_cols = np.array([[5, 6], [7, 8]])
    arr_expanded = np.concatenate((arr, extra_cols), axis=1)

If 'different data types' is your aesthetic

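A regular NumPy array holds a single dtype, so appending a column of another type upcasts everything to a common type. In the structured arrays world, each field keeps its own dtype, and rfn.append_fields is your paint tube that blends mixed types with elan.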
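As a sketch of the difference, the example below appends a float column first to a plain array and then to a structured one. The field names and values are made up for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

float_col = np.array([0.5, 1.5])

# Plain array: the whole result is upcast to float
plain = np.column_stack((np.array([[1, 2], [3, 4]]), float_col))
print(plain.dtype)  # float64

# Structured array: the integer fields keep their type,
# and only the new field is float
structured = np.array([(1, 2), (3, 4)], dtype=[('col1', int), ('col2', int)])
expanded = rfn.append_fields(structured, 'col3', float_col, usemask=False)
print(expanded.dtype.names)  # ('col1', 'col2', 'col3')
print(expanded['col3'])      # [0.5 1.5]
```

The usemask=False argument, as in the TLDR, asks for a plain structured array instead of a masked one.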

The subtle art of array manipulations

Indexing: a colorful aspect

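When extending an array, remember that an expression like 1:4 becomes a slice object, slice(1, 4), when framed in square brackets. That holds for ordinary indexing and for helpers such as np.r_ and np.c_, which is why slices offer a more vibrant palette for array manipulations than mere integer indices.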
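A short sketch of slices at work, assuming a 3x4 example array; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Inside brackets, 1:4 is shorthand for slice(1, 4)
same = np.array_equal(arr[:, 1:4], arr[:, slice(1, 4)])
print(same)  # True

# np.r_ expands a slice into a range of values
values = np.r_[1:4]
print(values)  # [1 2 3]

# np.c_ does the same and appends the values as a new column
extra = np.c_[arr, 1:4]
print(extra.shape)  # (3, 5)
```

The slice passed to np.c_ must expand to as many values as the array has rows, here three.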

Array broadcast rules: painting the rules

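When the column you add has a different shape from the canvas, it is tempting to expect NumPy's broadcasting rules to sort things out. They don't apply here: np.hstack and np.concatenate require every input to have the same number of dimensions, so a common smudge is a ValueError when adding a 1D column to a 2D array. Reshape your 1D vector into a 2D column for a smooth texture: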

    # Correctly spinning a new column onto palette
    new_col_shaped = new_col.reshape(-1, 1)
    arr_expanded = np.hstack((arr, new_col_shaped))
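For contrast, here is a sketch of what the smudge looks like when the 1D column is passed as is, along with two alternatives to reshape. It assumes the arr and new_col values from the TLDR; the other names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
new_col = np.array([5, 6])

# Mixing a 2D array with a 1D array fails in hstack
failed = False
try:
    np.hstack((arr, new_col))
except ValueError as exc:
    failed = True
    print("hstack failed:", exc)

# Indexing with np.newaxis adds the missing dimension, like reshape(-1, 1)
with_newaxis = np.hstack((arr, new_col[:, np.newaxis]))

# np.column_stack turns 1D inputs into columns on its own
with_column_stack = np.column_stack((arr, new_col))

print(np.array_equal(with_newaxis, with_column_stack))  # True
```

Which spelling to use is mostly a matter of taste; column_stack saves the reshape when your new data is already 1D.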

Preserve original masterpieces

It's a prized practice to preserve the original array. All these methods create a doppelganger, leaving your masterpiece unscathed.
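A quick way to convince yourself, sketched with np.column_stack and the TLDR values: change the expanded array and check that the original is untouched.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])
new_col = np.array([5, 6])
before = arr.copy()

arr_expanded = np.column_stack((arr, new_col))

# Modify the copy, then confirm the original did not change
arr_expanded[0, 0] = 99

print(np.shares_memory(arr, arr_expanded))  # False
print(np.array_equal(arr, before))          # True
```

The flip side is that every call copies all the data, so appending columns one at a time in a loop gets expensive for large arrays.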