Shuffle an array with Python, randomize array item order with Python

by Nikita Barsukov · Jan 2, 2025
TLDR

To mix up or shuffle an array (a Python list), use random.shuffle() from the standard library's random module. The beauty of this function is that it performs the shuffle in place and returns None:

    import random

    array = [1, 2, 3, 4, 5]
    random.shuffle(array)  # Shakes the array like a snow globe
    print(array)  # Output example: [3, 5, 1, 4, 2] - as random as a dice roll

This quick and dirty in-place method is memory-efficient, since it makes no additional copies of the array!
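
One gotcha follows from the in-place behavior: since random.shuffle() returns None, writing array = random.shuffle(array) throws the list away. Here is a minimal sketch of what actually happens (the names deck, same_object and result are only illustrative):

```python
import random

deck = [1, 2, 3, 4, 5]
same_object = deck             # a second name for the very same list

result = random.shuffle(deck)  # reorders deck in place

print(result)                  # None: nothing is returned
print(same_object is deck)     # True: no copy was made
print(sorted(deck))            # [1, 2, 3, 4, 5]: same items, only the order changed
```

If you need the shuffled list as a return value, make a copy first or use random.sample().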

Shuffling the immutable (becoming mutable)

If you are dealing with data that should remain unchanged (immutable) or if the original array needs to survive the shuffle unscathed, Python's got your back with random.sample():

    import random

    array = [1, 2, 3, 4, 5]
    shuffled_array = random.sample(array, len(array))
    # "Stay right there array, we got you a stunt double for this dangerous scene!"
    print(shuffled_array)  # Output example: [2, 4, 5, 3, 1]

This move gives you a new, shuffled copy of the original array, keeping the original array as safe as a bear in a bulletproof vest!
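
The same trick works for truly immutable sequences such as tuples, which random.shuffle() cannot touch at all. A small sketch, assuming you want a tuple back (random.sample() itself always returns a list):

```python
import random

original = (1, 2, 3, 4, 5)  # a tuple cannot be shuffled in place

# random.sample returns a new list, so wrap it to get a tuple again
shuffled = tuple(random.sample(original, len(original)))

print(original)  # (1, 2, 3, 4, 5): untouched
print(shuffled)  # the same items in a random order
```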

Sklearn shuffle: The consistent comrade

When dealing with related arrays (like in machine learning scenarios), it's crucial to shuffle these arrays together. Pair programming? Try pair shuffling with sklearn.utils.shuffle:

    from sklearn.utils import shuffle

    X = [1, 2, 3, 4, 5]
    y = ['a', 'b', 'c', 'd', 'e']
    X, y = shuffle(X, y, random_state=42)  # Like a pair of socks from the dryer without losing one!
    print(X, y)  # Pairs stay aligned: wherever 4 lands in X, 'd' lands at the same index in y

Setting a random_state ensures repeatable randomness, crucial for consistency in experiments or testing.
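
If scikit-learn is not installed, the idea of one shared, seeded permutation can be sketched with the standard library alone. This is an illustration of the concept, not how sklearn.utils.shuffle is implemented; shuffle_together and its seed parameter are hypothetical names:

```python
import random

def shuffle_together(a, b, seed=None):
    """Shuffle two equal-length lists with one shared permutation (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    rng = random.Random(seed)    # private generator, global random state is untouched
    order = list(range(len(a)))  # indices 0..n-1
    rng.shuffle(order)           # one permutation, applied to both lists
    return [a[i] for i in order], [b[i] for i in order]

X = [1, 2, 3, 4, 5]
y = ['a', 'b', 'c', 'd', 'e']

X1, y1 = shuffle_together(X, y, seed=42)
X2, y2 = shuffle_together(X, y, seed=42)

print(X1 == X2 and y1 == y2)  # True: same seed, same order
print(all(y[X.index(x)] == label for x, label in zip(X1, y1)))  # True: pairs stay aligned
```

Shuffling the index list rather than the data is what keeps the two lists in step.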

Tailor-made shuffles

Need more control over your shuffles? Want to tell Python how exactly to shuffle your array? Custom shuffling function to the rescue:

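    import random

    def custom_shuffle(array):
        random.seed(42)  # for reproducibility (note: this reseeds the global generator)
        # "Because random isn't quite random enough without a seed!" 😄
        shuffled_array = list(array)  # work on a copy so the input survives
        # Your custom shuffling logic here, go wild!
        # Placeholder only (an assumption, swap in your own rules): Fisher-Yates
        for i in range(len(shuffled_array) - 1, 0, -1):
            j = random.randint(0, i)
            shuffled_array[i], shuffled_array[j] = shuffled_array[j], shuffled_array[i]
        return shuffled_array

    array = [1, 2, 3, 4, 5]
    print(custom_shuffle(array))  # Your array, your shuffling rules, your world! 😎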

This flexibility allows you to implement any algorithm of your choice for array shuffling.
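
As one example of a house rule, here is a sketch that shuffles everything except a few pinned positions. The function shuffle_except and its pinned and rng parameters are made up for illustration; passing in a random.Random instance is one way to get reproducibility without reseeding the global generator:

```python
import random

def shuffle_except(array, pinned, rng=None):
    """Return a shuffled copy in which the indices in `pinned` keep their items."""
    if rng is None:
        rng = random.Random()    # unseeded generator by default
    result = list(array)         # copy, the input is left alone
    free = [i for i in range(len(result)) if i not in pinned]
    items = [result[i] for i in free]
    rng.shuffle(items)           # shuffle only the unpinned items
    for i, item in zip(free, items):
        result[i] = item         # put them back into the free slots
    return result

array = [1, 2, 3, 4, 5]
print(shuffle_except(array, pinned={0, 4}, rng=random.Random(42)))
# The first and last items stay put; the middle three are shuffled
```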

Alternatives and special scenarios

Depending on your data and use case, different shuffling methods might be more suitable:

ND-array shuffling using NumPy

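For large NumPy arrays, np.random.shuffle shuffles efficiently and in place. On a multi-dimensional array it reorders only along the first axis, so rows swap places while the contents of each row stay as they were: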

    import numpy as np

    arr = np.array([1, 2, 3, 4, 5])
    np.random.shuffle(arr)  # Ndarray, prepare to be atomized! 💥
    print(arr)  # Shuffled NumPy array, as ordered as a college kid's room
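
For new code, NumPy's documentation recommends the Generator interface created by np.random.default_rng(). The sketch below, assuming NumPy 1.17 or later, shows it on a 2-D array to make the first-axis behavior visible (the seed 42 and the array shape are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)          # seeded Generator for repeatable results

matrix = np.arange(12).reshape(4, 3)     # 4 rows, 3 columns

shuffled_rows = rng.permutation(matrix)  # returns a shuffled copy, rows reordered
rng.shuffle(matrix)                      # shuffles matrix in place along axis 0

print(shuffled_rows)
print(matrix)  # rows have moved, but each row's contents are intact
```

Use permutation() when the original must survive and shuffle() when an in-place update is fine.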

Pandas DataFrame shuffling

Tabular data can be shuffled within a pandas DataFrame using DataFrame.sample:

    import pandas as pd

    df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': ['a', 'b', 'c', 'd', 'e']})
    shuffled_df = df.sample(frac=1).reset_index(drop=True)
    # "Hey rows, switch places!" (randomly of course)
    print(shuffled_df)

Note: frac is the fraction of rows to return in the shuffled DataFrame. Setting frac=1 tells pandas to use all rows, and reset_index(drop=True) renumbers the rows from 0 while discarding the old index.

Secure shuffling with secrets

In a security-sensitive context, the secrets module provides cryptographically strong randomness, which the random module's default generator is not designed for:

    import secrets

    def secure_shuffle(array):
        a = array[:]  # copy so the original stays untouched
        for i in range(len(a)):
            swap_idx = secrets.randbelow(i + 1)  # random index from 0 to i inclusive
            a[i], a[swap_idx] = a[swap_idx], a[i]
        return a

    array = [1, 2, 3, 4, 5]
    print(secure_shuffle(array))
    # "Agent Array, your mission, should you choose to accept it, is to shuffle securely!"
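
If you would rather not hand-roll the swap loop, the secrets module also exposes SystemRandom, a generator backed by the operating system's entropy source that has the familiar shuffle() method. A minimal sketch (secure_shuffle_builtin is a hypothetical name chosen here):

```python
import secrets

def secure_shuffle_builtin(array):
    """Return a shuffled copy using the OS entropy source (hypothetical helper)."""
    a = list(array)                    # copy so the input is left alone
    secrets.SystemRandom().shuffle(a)  # same class as random.SystemRandom
    return a

array = [1, 2, 3, 4, 5]
print(secure_shuffle_builtin(array))  # a random order each run
print(array)                          # [1, 2, 3, 4, 5]: the original is untouched
```

Because SystemRandom draws from the operating system, it cannot be seeded, so results are not reproducible by design.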