Shuffle an array with Python, randomize array item order with Python

python

shuffle

randomization

dataframe

by Nikita Barsukov · Jan 2, 2025

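To shuffle an array (a list) in Python, use the random.shuffle() function from the standard library's random module. The beauty of this function is that it performs the shuffle in-place, rearranging the list you pass in rather than building a new one: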

import random

array = [1, 2, 3, 4, 5]
random.shuffle(array) # Shakes the array like a snow globe
print(array)  # Output example: [3, 5, 1, 4, 2] - as random as a dice roll

This quick in-place shuffle is memory-efficient because it never makes an additional copy of the array! The flip side is that the original order is gone once the shuffle has run.
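One detail worth seeing in action: because the work happens in place, random.shuffle() returns None. The sketch below shows that, and also uses a seeded random.Random instance for a repeatable order. The seed value 42 is an arbitrary choice for illustration.

```python
import random

array = [1, 2, 3, 4, 5]
result = random.shuffle(array)  # shuffles in place...
print(result)                   # None, so never write array = random.shuffle(array)
print(sorted(array))            # [1, 2, 3, 4, 5]: same items, only the order changed

# A dedicated generator with a fixed seed gives a repeatable order
# without touching the global random state.
first = [1, 2, 3, 4, 5]
second = [1, 2, 3, 4, 5]
random.Random(42).shuffle(first)
random.Random(42).shuffle(second)
print(first == second)          # True
```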

Shuffling the immutable (and keeping the original intact)

If you are dealing with data that should remain unchanged (immutable) or if the original array needs to survive the shuffle unscathed, Python's got your back with random.sample():

import random

array = [1, 2, 3, 4, 5]
shuffled_array = random.sample(array, len(array)) 
# "Stay right there array, we got you a stunt double for this dangerous scene!"
print(shuffled_array)  # Output example: [2, 4, 5, 3, 1]

This move gives you a new, shuffled copy of the original array, keeping the original array as safe as a bear in a bulletproof vest!
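The same trick covers sequences that cannot be shuffled in place at all, such as tuples and strings. A minimal sketch, with variable names picked only for this example:

```python
import random

original = (1, 2, 3, 4, 5)            # a tuple cannot be shuffled in place
shuffled = random.sample(original, len(original))

print(original)                       # (1, 2, 3, 4, 5), untouched
print(type(shuffled).__name__)        # list: random.sample always returns a list
print(tuple(shuffled))                # convert back if you need a tuple

word = "python"
scrambled = "".join(random.sample(word, len(word)))
print(scrambled)                      # the letters of "python" in random order
```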

Sklearn shuffle: The consistent comrade

When dealing with related arrays (like in machine learning scenarios), it's crucial to shuffle these arrays together. Pair programming? Try pair shuffling with sklearn.utils.shuffle:

from sklearn.utils import shuffle

X = [1, 2, 3, 4, 5]
y = ['a', 'b', 'c', 'd', 'e']

X, y = shuffle(X, y, random_state=42) # Like a pair of socks from the dryer without losing one!
print(X, y)  # Illustrative output: [4, 3, 1, 5, 2] ['d', 'c', 'a', 'e', 'b'] (every X[i] still lines up with its y[i])

Setting random_state to a fixed integer gives you repeatable randomness: the same seed produces the same order on every run, which is crucial for consistency in experiments or testing.
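To see what "shuffling together" means without installing anything, here is a stdlib-only sketch of the idea: build one random permutation of the indices and apply it to both lists. The helper name shuffle_together and its seed parameter are made up for this example, and this is not a claim about how scikit-learn implements it internally.

```python
import random

def shuffle_together(first, second, seed=None):
    """Hypothetical helper: shuffle two equal-length lists with one shared permutation."""
    if len(first) != len(second):
        raise ValueError("both sequences must have the same length")
    rng = random.Random(seed)
    order = list(range(len(first)))
    rng.shuffle(order)  # one permutation of the indices...
    # ...applied to both lists, so matching items move together
    return [first[i] for i in order], [second[i] for i in order]

X = [1, 2, 3, 4, 5]
y = ['a', 'b', 'c', 'd', 'e']
X_shuffled, y_shuffled = shuffle_together(X, y, seed=42)

pairs_before = set(zip(X, y))
pairs_after = set(zip(X_shuffled, y_shuffled))
print(pairs_before == pairs_after)  # True: every value kept its label
```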

Tailor-made shuffles

Need more control over your shuffles? Want to tell Python how exactly to shuffle your array? Custom shuffling function to the rescue:

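import random

def custom_shuffle(array, seed=None):
    # Your custom shuffling logic here, go wild!
    # This placeholder runs a plain Fisher-Yates shuffle on a copy.
    rng = random.Random(seed)  # local generator: reproducible when seeded, global state untouched
    shuffled_array = list(array)
    for i in range(len(shuffled_array) - 1, 0, -1):
        j = rng.randint(0, i)  # pick a partner from the part not yet fixed
        shuffled_array[i], shuffled_array[j] = shuffled_array[j], shuffled_array[i]
    return shuffled_array

array = [1, 2, 3, 4, 5]
print(custom_shuffle(array, seed=42))  # Your array, your shuffling rules, your world! 😎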

This flexibility allows you to implement any algorithm of your choice for array shuffling.
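As one more illustration of "your own rules", here is a sketch that shuffles by attaching a random key to each item and sorting on it. It is slower than a swap-based shuffle (O(n log n) instead of O(n)) and is shown only as an example of alternative logic. The function name and seed parameter are hypothetical.

```python
import random

def shuffle_by_random_key(array, seed=None):
    """Hypothetical example: return a shuffled copy by sorting on random keys."""
    rng = random.Random(seed)
    keyed = [(rng.random(), item) for item in array]  # attach a random key to each item
    keyed.sort(key=lambda pair: pair[0])              # sort on the key only
    return [item for _, item in keyed]

array = [1, 2, 3, 4, 5]
print(shuffle_by_random_key(array, seed=42))
print(array)  # [1, 2, 3, 4, 5]: the input list is left alone
```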

Alternatives and special scenarios

Depending on your data and use case, different shuffling methods might be more suitable:

ND-array shuffling using NumPy

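For NumPy arrays, including large multi-dimensional ones, np.random.shuffle does the job efficiently and in place. On a multi-dimensional array it only reorders along the first axis, so whole rows change places while the contents of each row stay put: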

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
np.random.shuffle(arr) # Ndarray, prepare to be atomized! 💥
print(arr)  # Shuffled NumPy array, as ordered as a college kid's room
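A short sketch of the multi-dimensional case, using np.random.default_rng, the generator interface that NumPy recommends for new code. The seed and the array shape are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)       # seed chosen arbitrarily for illustration

matrix = np.arange(12).reshape(4, 3)  # 4 rows, 3 columns
rng.shuffle(matrix)                   # in place, reorders whole rows along axis 0
print(matrix)                         # each row still holds three consecutive numbers

shuffled_copy = rng.permutation(matrix)  # returns a shuffled copy, matrix is untouched
print(shuffled_copy.shape)               # (4, 3)
```

Use rng.permutation when the original array has to survive, and rng.shuffle when shuffling in place is fine.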

Pandas DataFrame shuffling

Tabular data can be shuffled within a pandas DataFrame using DataFrame.sample:

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': ['a', 'b', 'c', 'd', 'e']})
shuffled_df = df.sample(frac=1).reset_index(drop=True)
# "Hey rows, switch places!" (randomly of course)
print(shuffled_df)

Note: frac is the fraction of rows to return. Setting frac=1 tells pandas to return all rows, in random order, and reset_index(drop=True) swaps the shuffled index labels for a fresh 0 to n-1 index.
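DataFrame.sample also accepts a random_state argument, which makes the shuffle repeatable. A minimal sketch, with 42 as an arbitrary seed:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': ['a', 'b', 'c', 'd', 'e']})

run_1 = df.sample(frac=1, random_state=42).reset_index(drop=True)
run_2 = df.sample(frac=1, random_state=42).reset_index(drop=True)
print(run_1.equals(run_2))  # True: same seed, same row order

# Rows move as a unit, so col1 and col2 stay matched.
pairs_before = sorted(zip(df['col1'].tolist(), df['col2'].tolist()))
pairs_after = sorted(zip(run_1['col1'].tolist(), run_1['col2'].tolist()))
print(pairs_before == pairs_after)  # True
```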

Secure shuffling with secrets

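In a security-sensitive context, the random module's default generator is not appropriate because its output can be predicted. The secrets module draws from the operating system's cryptographically strong randomness source instead: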

import secrets

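def secure_shuffle(array):
    a = array[:]  # work on a copy so the caller's list is untouched
    # Fisher-Yates shuffle: swap each position with a securely chosen index at or before it
    for i in range(len(a)):
        swap_idx = secrets.randbelow(i + 1)  # uniform integer in [0, i]
        a[i], a[swap_idx] = a[swap_idx], a[i]
    return a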

array = [1, 2, 3, 4, 5]
print(secure_shuffle(array))  # "Agent Array, your mission, should you choose to accept it, is to shuffle securely!"
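If you would rather not write the loop yourself, random.SystemRandom offers the familiar shuffle() method on top of the same operating-system randomness source that secrets relies on. A minimal sketch (variable names are just for this example):

```python
import random

secure_rng = random.SystemRandom()  # draws from os.urandom; it cannot be seeded

array = [1, 2, 3, 4, 5]
shuffled = array[:]                 # copy first so the original stays intact
secure_rng.shuffle(shuffled)        # in-place shuffle of the copy

print(shuffled)
print(array)                        # [1, 2, 3, 4, 5]
```

The trade-off is that results are never reproducible, which is exactly what you want here.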