
A weighted version of random.choice

By Nikita Barsukov · Oct 7, 2024
TLDR

To select items according to their weights, use random.choices() with the weights parameter. This standard-library function (available since Python 3.6) takes care of the probability distribution. Here's a quick example of choosing a weighted random item:

import random

items = ['A', 'B', 'C', 'D']
weights = [10, 1, 1, 1]

selected_item = random.choices(items, weights)[0]
print(selected_item)

In this code, 'A' carries a weight of 10 out of a total of 13, so it is picked about 77% of the time. Note that random.choices() returns a list, so we use [0] to get the single selected item.
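To see the weights in action, it can help to draw many samples at once with the k parameter and count the results. The sketch below assumes a seeded random.Random instance (the seed value and variable names are illustrative only); the empirical share of 'A' should land near 10/13, roughly 0.77.

```python
import random
from collections import Counter

items = ['A', 'B', 'C', 'D']
weights = [10, 1, 1, 1]

# A dedicated, seeded generator keeps the experiment reproducible.
rng = random.Random(42)

# k controls how many items are drawn (with replacement).
draws = rng.choices(items, weights=weights, k=10_000)

counts = Counter(draws)
share_a = counts['A'] / len(draws)

print(counts.most_common())
print(f"share of 'A': {share_a:.3f}")
```

Because the draws are made with replacement, the same item can appear any number of times in the result.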

For large datasets: Use NumPy

For large arrays, or when your data already lives in NumPy, numpy.random.choice() has you covered. Pass the probability distribution via the p parameter (the probabilities must sum to 1), and choose whether to sample with or without replacement via the replace parameter.

Here's an example:

import numpy as np

items = ['A', 'B', 'C', 'D']
weights = np.array([10, 1, 1, 1])

# Normalize to get a probability distribution, now isn't that weight off your shoulder?
weights = weights / weights.sum()

selected_item = np.random.choice(items, p=weights)
print(selected_item)
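NumPy also offers a newer Generator interface, created with np.random.default_rng(), whose choice method takes the same p and replace parameters. The following is a minimal sketch of the same selection using it; the seed value is an arbitrary assumption for reproducibility.

```python
import numpy as np

items = ['A', 'B', 'C', 'D']
weights = np.array([10, 1, 1, 1], dtype=float)

# p must sum to 1, so normalize the raw weights first.
probabilities = weights / weights.sum()

# A seeded Generator gives reproducible draws.
rng = np.random.default_rng(seed=0)

selected_item = rng.choice(items, p=probabilities)
print(selected_item)
```

Keeping the generator in a variable, rather than relying on global state, makes it easier to control seeding in tests.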

When bisecting is not a surgery: Implementing bisect

Python's bisect module offers a neat solution: build the cumulative sums of the weights, pick a random point below the total, and binary-search for the interval it falls in. Each lookup then costs only O(log n):

import bisect
import itertools
import random

items = ['A', 'B', 'C', 'D']
weights = [10, 20, 30, 40]

# Create a cumulative distribution, because sometimes the weight you carry is just...cumulative
cum_weights = list(itertools.accumulate(weights))  # [10, 30, 60, 100]

# Throw a dart in the range [0, total), and hope it doesn't hit your foot
rnd = random.random() * cum_weights[-1]

# Find where this "dart" lands in our sorted cumulative weights;
# the hi argument caps the result at the last valid index
selected_item = items[bisect.bisect(cum_weights, rnd, 0, len(items) - 1)]
print(selected_item)
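When the same weights are sampled repeatedly, the cumulative sums only need to be computed once. The class below is a hypothetical helper (WeightedSampler is not part of any library) sketching that idea: setup is O(n), and each draw is a single binary search.

```python
import bisect
import itertools
import random


class WeightedSampler:
    """Hypothetical helper: precompute cumulative weights, then sample in O(log n)."""

    def __init__(self, items, weights):
        self.items = list(items)
        self.cum_weights = list(itertools.accumulate(weights))
        if not self.items or len(self.items) != len(self.cum_weights):
            raise ValueError("items and weights must be non-empty and of equal length")
        if self.cum_weights[-1] <= 0:
            raise ValueError("total weight must be positive")

    def sample(self):
        # random.random() is in [0, 1), so r falls in [0, total).
        r = random.random() * self.cum_weights[-1]
        # hi caps the index at the last item, guarding against round-off.
        index = bisect.bisect(self.cum_weights, r, 0, len(self.items) - 1)
        return self.items[index]


sampler = WeightedSampler(['A', 'B', 'C', 'D'], [10, 20, 30, 40])
print([sampler.sample() for _ in range(5)])
```

The standard library covers the same case directly: random.choices() accepts a cum_weights argument, so precomputed cumulative sums can be passed in without a custom class.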

Readability by zip: The hidden zippiness of Python

Code readability and functionality are two best friends who should never be separated! Use zip() to pair items with their respective weights, increasing the readability and convenience of your function.

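import random

def weighted_random_choice(pairs):
    # zip() returns a one-shot iterator and we loop twice, so materialize it first
    pairs = list(pairs)
    total = sum(weight for _, weight in pairs)
    r = random.random() * total  # a point in [0, total)
    upto = 0
    for item, weight in pairs:
        upto += weight
        if r < upto:
            return item

items = ['A', 'B', 'C', 'D']
weights = [10, 20, 30, 40]
pairs_of_items_and_weights = zip(items, weights)
print(weighted_random_choice(pairs_of_items_and_weights))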

Goodbye loops! Using NumPy

Hand-written loops are overkill for a one-off selection and slow for many. Prefer a function that does the job in a single call: np.random.choice draws one item or, with the size parameter, a whole array of them, with or without replacement.
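As a rough illustration of the loop-free approach, the sketch below draws 1,000 weighted samples in one call and tallies them with np.unique. The sample size and variable names are arbitrary choices for the example.

```python
import numpy as np

items = np.array(['A', 'B', 'C', 'D'])
weights = np.array([10, 1, 1, 1], dtype=float)

# One vectorized call replaces a Python-level loop of 1,000 draws.
samples = np.random.choice(items, size=1_000, p=weights / weights.sum())

# Tally how often each item was drawn.
values, counts = np.unique(samples, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
```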

Taking all weight types onboard

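Our functions should work with any real-valued numeric weights: integers, floats, or fractions. random.choices() accepts all of these (but not decimal.Decimal values), expects every weight to be non-negative and finite, and raises ValueError if all weights are zero. Complex numbers are not valid weights, since they have no ordering.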
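One way to enforce these expectations in your own functions is to validate weights up front. The helper below, checked_weights, is a hypothetical name for this sketch; it relies on numbers.Real to accept integers, floats, and fractions while rejecting complex values.

```python
import math
import numbers
import random
from fractions import Fraction


def checked_weights(weights):
    """Hypothetical validator: return the weights as a list, or raise on bad input."""
    checked = []
    for w in weights:
        # int, float, and Fraction are all numbers.Real; complex is not.
        if not isinstance(w, numbers.Real):
            raise TypeError(f"weight {w!r} is not a real number")
        if not math.isfinite(w) or w < 0:
            raise ValueError(f"weight {w!r} must be finite and non-negative")
        checked.append(w)
    if not any(checked):
        raise ValueError("at least one weight must be positive")
    return checked


mixed = checked_weights([3, 0.5, Fraction(1, 4)])
pick = random.choices(['A', 'B', 'C'], weights=mixed)[0]
print(pick)
```

Validating once at the boundary keeps the sampling code itself free of defensive checks.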

Digging into the Python documentation

Dive into the official Python documentation while crafting your functions. It's the Hogwarts of Python, filled with examples, best practices, and hidden treasures crucial for your magic spells!

Unique selection with NumPy's replace

Sometimes, you may want to ensure an element isn't selected more than once. Set replace=False in NumPy's numpy.random.choice for unique selections:

import numpy as np

items = np.array(['A', 'B', 'C', 'D'])
weights = np.array([10, 20, 30, 40])

# Here, replace=False makes 'A' play hard to get.
unique_choices = np.random.choice(items, size=2, replace=False, p=weights / weights.sum())
print(unique_choices)
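One caveat with replace=False is that you cannot request more unique items than the population holds; NumPy raises ValueError in that case. The sketch below, with illustrative variable names, shows both a valid draw and the failing one.

```python
import numpy as np

items = np.array(['A', 'B', 'C', 'D'])
weights = np.array([10, 20, 30, 40])
p = weights / weights.sum()

# Valid: two distinct items out of four.
pair = np.random.choice(items, size=2, replace=False, p=p)
print(pair)

# Invalid: five unique items cannot come from a population of four.
try:
    np.random.choice(items, size=5, replace=False, p=p)
    oversample_failed = False
except ValueError as exc:
    oversample_failed = True
    print(f"ValueError: {exc}")
```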