
Multiprocessing: Use tqdm to display a progress bar

python
multiprocessing
tqdm
concurrency
by Anton Shumikhin · Mar 1, 2025
TLDR

Fasten your seatbelts, we're going parallel! To show a tqdm progress bar for multiprocessing tasks, share a counter between the worker processes and let the parent process poll it. Here's how:

    from multiprocessing import Process, Manager, Lock
    from tqdm import tqdm

    def worker(counter, lock):
        # Yeah! I'm the worker doing all the hard work!
        with lock:
            # Lock acquired, update the counter.
            counter.value += 1

    if __name__ == "__main__":
        manager = Manager()
        counter = manager.Value('i', 0)
        # Lock to protect counter.value from simultaneous access
        lock = Lock()
        workers = 10

        # tqdm acting like the popcorn guy, showing you the progress.
        with tqdm(total=workers) as progress_bar:
            # Let's create those workers. (No Oompa-Loompas were harmed.)
            processes = [Process(target=worker, args=(counter, lock)) for _ in range(workers)]
            for p in processes:
                p.start()

            # tqdm adding butter to popcorn while the movie (process) is still running.
            while any(p.is_alive() for p in processes):
                progress_bar.update(n=counter.value - progress_bar.n)

            # Don't leave anyone behind!
            for p in processes:
                p.join()

            # Catch any increments that landed after the last poll.
            progress_bar.update(n=counter.value - progress_bar.n)

This code updates the tqdm progress bar as the worker processes increment the shared counter. The polling loop runs without time.sleep(), so it reacts immediately at the cost of keeping the parent process busy, and the final update after join() makes sure the bar reaches 100% even if the last workers finish between two polls. We aren't Rip Van Winkle!

When to use multiprocessing.Pool imap, imap_unordered?

Pool.imap and Pool.imap_unordered hand results back one at a time instead of all at once, which is exactly what tqdm needs to tick while your CPU cores stay busy. Let's see how you can use them both.

Ordered Progress Bar with imap

What's imap? It's a lazier version of the map function for multiprocessing. It returns an iterator that delivers results in the same order as the inputs. Let's spice things up with tqdm:

    from multiprocessing import Pool
    from tqdm import tqdm

    def heavy_lifter(x):
        # Lifting weights and building muscles!
        return x * x

    if __name__ == '__main__':
        pool_size = 4
        tasks = range(1000)
        with Pool(pool_size) as pool:
            # Inboxing the results served hot by imap!
            results = list(tqdm(pool.imap(heavy_lifter, tasks), total=len(tasks)))

Speed Lover? Use imap_unordered

For those who love the 'Fast and Furious' style, imap_unordered fires off results as soon as they're ready without bothering about the order.

    # All you need to do is replace pool.imap with pool.imap_unordered.
    results = list(tqdm(pool.imap_unordered(heavy_lifter, tasks), total=len(tasks)))
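Since imap_unordered delivers results in completion order, the resulting list may not line up with the inputs. One way to keep the speed and still know which result belongs to which task is to send the index along for the ride. This is only a minimal sketch: indexed_lifter and run_unordered are hypothetical helper names made up for this example, not part of multiprocessing or tqdm.

```python
from multiprocessing import Pool
from tqdm import tqdm

def heavy_lifter(x):
    return x * x

def indexed_lifter(pair):
    # Hypothetical helper: carry the task index through the worker.
    index, x = pair
    return index, heavy_lifter(x)

def run_unordered(tasks, pool_size=4):
    tasks = list(tasks)
    results = [None] * len(tasks)
    with Pool(pool_size) as pool:
        finished = pool.imap_unordered(indexed_lifter, enumerate(tasks))
        # The bar ticks in completion order; the index puts each result back in place.
        for index, value in tqdm(finished, total=len(tasks)):
            results[index] = value
    return results

if __name__ == '__main__':
    print(run_unordered(range(10)))
```

If the order of the results never matters (say, you only sum them up), skip the index and use the one-liner above.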

Boost your Progress Bar with process_map

If you don't mind sidekicks, tqdm.contrib.concurrent.process_map is your Robin. It maps your function over the tasks in a process pool (driven by concurrent.futures.ProcessPoolExecutor), shows the progress bar for you, and returns the results as a list.

    from tqdm.contrib.concurrent import process_map

    # Robin dives into action with max_workers and chunksize set to performance.
    results = process_map(heavy_lifter, tasks, max_workers=pool_size, chunksize=10)

This needs tqdm version 4.42.0 or higher. Always keep your packages updated, a stitch in time saves nine!

The Pitfalls (and how to jump over them!)

Assembled below are some key points to consider, a kind of multiprocessing FAQ:

Order vs Speed

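Choose wisely between order (imap) and speed (imap_unordered). imap holds finished results back until every earlier one is ready, while imap_unordered hands them over the moment they finish. The rabbit ain't always the winner!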

Timing Inconsistencies

Because imap_unordered hands back results in completion order, the rate tqdm measures can swing when task durations vary, so the time estimate may jump around. Don't be surprised!

Right method for the right job

Pool has many more methods (apply_async, etc). Keep exploring and choose the one that suits your needs.
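As one illustration, apply_async accepts a callback that runs in the parent process each time a task finishes successfully, which makes it a handy place to tick the bar. A minimal sketch, assuming the same toy heavy_lifter as before; run_with_apply_async is a hypothetical helper name.

```python
from multiprocessing import Pool
from tqdm import tqdm

def heavy_lifter(x):
    return x * x

def run_with_apply_async(tasks, pool_size=4):
    tasks = list(tasks)
    with Pool(pool_size) as pool, tqdm(total=len(tasks)) as progress_bar:
        # The callback is called in the parent process, once per successful task.
        pending = [
            pool.apply_async(heavy_lifter, (x,),
                             callback=lambda _result: progress_bar.update(1))
            for x in tasks
        ]
        # get() blocks until the result is ready and re-raises worker exceptions.
        return [job.get() for job in pending]

if __name__ == '__main__':
    print(run_with_apply_async(range(10)))
```

The callback is skipped when a task raises, so the bar stays short of 100% in that case; pass error_callback as well if you want failed tasks to count.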

Always clean up!

Enclosing Pool within a context manager ensures cleanup: leaving the with block calls terminate() on the pool, so collect your results inside the block. No rubbish left behind!
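When the pool cannot live inside a with block (say, it is created in one place and used in another), the cleanup can be done by hand: close() stops the pool from accepting new tasks and join() waits for the workers to exit. A minimal sketch of that pattern; run_with_manual_cleanup is a hypothetical helper name.

```python
from multiprocessing import Pool
from tqdm import tqdm

def heavy_lifter(x):
    return x * x

def run_with_manual_cleanup(tasks, pool_size=4):
    tasks = list(tasks)
    pool = Pool(pool_size)
    try:
        # Consume the lazy imap iterator while the pool is still alive.
        return list(tqdm(pool.imap(heavy_lifter, tasks), total=len(tasks)))
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker processes to exit

if __name__ == '__main__':
    print(run_with_manual_cleanup(range(10)))
```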

Step up your game with external libraries

Look beyond the standard library. There is a world waiting to be explored!

Check out p_tqdm

p_tqdm wraps multiprocessing and tqdm behind map-style helpers such as p_map, so you get a parallel map with a progress bar in one call. Find it on GitHub!

Think Chunksize

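chunksize sets how many tasks each worker receives at a time. Larger chunks cut the inter-process overhead when individual tasks are tiny; the trade-off is a progress bar that advances in bursts. Turn the knob and measure!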
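A quick way to see the effect is to time the same workload with two settings. This is a rough sketch rather than a benchmark: timed_run is a hypothetical helper, the values 1 and 100 are arbitrary, and the timings depend entirely on your machine and on how heavy each task is.

```python
import time
from multiprocessing import Pool
from tqdm import tqdm

def heavy_lifter(x):
    return x * x

def timed_run(tasks, chunksize, pool_size=4):
    tasks = list(tasks)
    start = time.perf_counter()
    with Pool(pool_size) as pool:
        # imap accepts chunksize as its third argument (default 1).
        results = list(tqdm(pool.imap(heavy_lifter, tasks, chunksize=chunksize),
                            total=len(tasks), desc=f"chunksize={chunksize}"))
    return results, time.perf_counter() - start

if __name__ == '__main__':
    for chunksize in (1, 100):
        _, seconds = timed_run(range(10000), chunksize)
        print(f"chunksize={chunksize}: {seconds:.2f}s")
```

For tasks that each take a long time, the difference tends to shrink, and a chunksize of 1 keeps the bar smooth.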

Be Updated

Ensure you have the latest tqdm version. Subtle improvements can deliver significant gains.