Calculating a directory's size using Python?

by Alex Kataev · Jan 25, 2025

If you're looking for a simple, dependable way to calculate a directory's size in Python, the os.walk() and os.path.getsize() combo is the way to go:

    import os

    def get_dir_size(path):
        # Doing some digital cardio here (walking around your directory)
        return sum(
            os.path.getsize(os.path.join(dp, f))
            for dp, dn, fn in os.walk(path)
            for f in fn
            if os.path.isfile(os.path.join(dp, f))
        )

    # Example usage
    print(f"Directory size: {get_dir_size('/your/directory')} bytes")  # When size matters

Leveraging Python's modern tools

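A performance boost can often be had with os.scandir() and entry.stat().st_size. Each entry that os.scandir() yields carries cached file-type information, so fewer system calls are needed per file. Hands down, they are the best power couple since peanut butter met jelly: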

    import os

    def get_dir_size_fast(path):
        total_size = 0
        with os.scandir(path) as it:
            for entry in it:
                if entry.is_file(follow_symlinks=False):
                    # Alright, show off your size!
                    total_size += entry.stat(follow_symlinks=False).st_size
                elif entry.is_dir(follow_symlinks=False):
                    # Recurse so files in subdirectories are counted too
                    total_size += get_dir_size_fast(entry.path)
        return total_size

    # Example usage
    print(f"Directory size: {get_dir_size_fast('/your/directory')} bytes")  # Faster than a stolen Ferrari
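If you'd rather measure the speed difference than take it on faith, a rough benchmark along these lines may help. It is only a sketch: the helper names (walk_size, scandir_size, build_sample_tree) and the size of the sample tree are made up for illustration, and the timings will vary with your disk, operating system, and Python version.

```python
import os
import tempfile
import timeit


def walk_size(path):
    """Sum file sizes with os.walk() and os.path.getsize()."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if os.path.isfile(full_path):
                total += os.path.getsize(full_path)
    return total


def scandir_size(path):
    """Sum file sizes recursively with os.scandir(), ignoring symlinks."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
            elif entry.is_dir(follow_symlinks=False):
                total += scandir_size(entry.path)
    return total


def build_sample_tree(root, dirs=20, files_per_dir=50, file_size=1024):
    """Create dirs * files_per_dir files of file_size bytes under root."""
    for d in range(dirs):
        subdir = os.path.join(root, f"dir_{d}")
        os.mkdir(subdir)
        for f in range(files_per_dir):
            with open(os.path.join(subdir, f"file_{f}.bin"), "wb") as handle:
                handle.write(b"\0" * file_size)


with tempfile.TemporaryDirectory() as root:
    build_sample_tree(root)
    expected = 20 * 50 * 1024  # total bytes written by build_sample_tree
    walk_total = walk_size(root)
    scandir_total = scandir_size(root)
    # Repeat each measurement a few times to smooth out noise
    walk_time = timeit.timeit(lambda: walk_size(root), number=20)
    scandir_time = timeit.timeit(lambda: scandir_size(root), number=20)

print(f"os.walk:    {walk_total} bytes in {walk_time:.4f} s")
print(f"os.scandir: {scandir_total} bytes in {scandir_time:.4f} s")
```

Both functions should report the same total on a tree without symlinks. Which one wins on time, and by how much, is something to confirm on your own data.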

Be wary of symbolic links; they're like that friend who makes lame copies of your jokes. Following them can count the same file twice or even lead to infinite recursion:

    import os

    def calculate_directory_size_no_links(path):
        total_size = 0
        # followlinks=False (the default) keeps os.walk() out of symlinked directories
        for dirpath, dirnames, filenames in os.walk(path, followlinks=False):
            for f in filenames:
                fp = os.path.join(dirpath, f)
                if not os.path.islink(fp):
                    # Sorry, we don't do photocopies
                    total_size += os.path.getsize(fp)
        return total_size
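To see the double counting in action, here is a small experiment in a throwaway temp directory. It is a sketch: the function names size_following_links and size_skipping_links are invented for this demo, and it assumes a platform where os.symlink() works without extra privileges (on Windows it may need developer mode or admin rights).

```python
import os
import tempfile


def size_following_links(path):
    """Naive total: os.path.getsize() follows symlinks to their targets."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if os.path.isfile(full_path):
                total += os.path.getsize(full_path)
    return total


def size_skipping_links(path):
    """Total that ignores symlinked files entirely."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path, followlinks=False):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not os.path.islink(full_path):
                total += os.path.getsize(full_path)
    return total


with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "data.bin")
    with open(original, "wb") as handle:
        handle.write(b"x" * 100)  # one real file, 100 bytes
    # A symlink pointing at the same file, in the same directory
    os.symlink(original, os.path.join(root, "alias.bin"))

    naive_total = size_following_links(root)
    safe_total = size_skipping_links(root)

print(naive_total)  # 200: the 100-byte file is counted once per name
print(safe_total)   # 100: the symlink is skipped
```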

Friendly size format

Raw byte counts are hard to read at a glance. Let's make file sizes more readable:

    def human_readable_size(size):
        # Divide by 1024 until the value fits the current unit
        for unit in ['bytes', 'KB', 'MB', 'GB', 'TB']:
            if size < 1024:
                return f"{size:.2f} {unit}"
            size /= 1024
        # Anything that survives five divisions is reported in petabytes
        return f"{size:.2f} PB"  # "Petabytes" sounds cute, but you don't wanna meet them in a dark alley.

    # Example usage for human-readable format
    size_in_bytes = get_dir_size('/your/directory')
    print(f"Directory size: {human_readable_size(size_in_bytes)}")  # Now, in baby language
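A few worked numbers make the conversion concrete. The variant below is a sketch, not part of the original recipe: the name format_size and its base parameter are assumptions added here so you can switch between 1024-based and 1000-based units. Strictly speaking, 1024-based units are called KiB, MiB, and so on, but many tools label them KB and MB, as this article does.

```python
def format_size(num_bytes, base=1024):
    """Format a byte count; base is 1024 (binary) or 1000 (decimal)."""
    units = ["bytes", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    # Stop one unit early so the largest unit is used for everything bigger
    for unit in units[:-1]:
        if size < base:
            return f"{size:.2f} {unit}"
        size /= base
    return f"{size:.2f} {units[-1]}"


print(format_size(1536))             # 1.50 KB  (1536 / 1024)
print(format_size(1536, base=1000))  # 1.54 KB  (1536 / 1000, rounded)
print(format_size(5 * 1024**2))      # 5.00 MB
```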

Python's best-kept secrets

Embracing the 'pathlib' module

The pathlib module makes directory size calculation a walk in the park:

    from pathlib import Path

    def get_dir_size_pathlib(path):
        # pathlib a day keeps the terminal away
        return sum(f.stat().st_size for f in Path(path).rglob('*') if f.is_file())

    # Example usage
    print(f"Directory size: {get_dir_size_pathlib('/your/directory')} bytes")  # It's Py-magic

The outfit change for output

Some situations call for different units of measurement. Here's how you can easily alter your function's output:

    class DirectorySizer:
        # Directory Sizer: in the end, Size does matter
        def __init__(self, path):
            self._bytes = get_dir_size_pathlib(path)

        @property
        def kilobytes(self):
            # Megabytes are overrated
            return self._bytes / 1024

        @property
        def megabytes(self):
            # Who's the big boy now?
            return self._bytes / 1024**2

        # ...include more units as deemed fit

    # Example usage
    sizer = DirectorySizer('/your/directory')
    print(f"Directory size: {sizer.megabytes} MB")  # Megabytes, the absolute unit

Avoiding os-specific commands

It's tempting to reach for a quick fix with du -sh via the subprocess module, but that temptation is best resisted:

    import subprocess

    def get_size_with_du(path):
        # du prints the size followed by the path; keep only the size column
        result = subprocess.check_output(['du', '-sh', path]).split()[0].decode('utf-8')
        return result

    # Example usage
    print(f"Directory size: {get_size_with_du('/your/directory')}")  # Convenient, but it needs a Unix-like system with du installed

Remember, it's important to pursue cross-platform compatibility. This method falls short because it relies on the Unix du command, which a stock Windows installation does not provide.
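If you still want du's output where it happens to exist, one hedged option is to probe for it with shutil.which() and otherwise rely on pure Python. The sketch below assumes two helper names invented for illustration (python_dir_size and du_summary). Keep in mind that du typically reports disk usage rather than the sum of file lengths, so its figure may not match the byte total.

```python
import os
import shutil
import subprocess
import tempfile


def python_dir_size(path):
    """Portable byte total using os.scandir(); symlinks are ignored."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
            elif entry.is_dir(follow_symlinks=False):
                total += python_dir_size(entry.path)
    return total


def du_summary(path):
    """Return du's human-readable summary, or None when du is not installed."""
    if shutil.which("du") is None:
        return None
    output = subprocess.check_output(["du", "-sh", path], text=True)
    return output.split()[0]


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "sample.bin"), "wb") as handle:
        handle.write(b"\0" * 2048)
    portable_total = python_dir_size(root)
    du_text = du_summary(root)

print(f"Pure Python: {portable_total} bytes")
print(f"du -sh:      {du_text if du_text is not None else 'du not available'}")
```

The pure-Python number is the one to depend on; the du line is a bonus on systems that have it.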