Get human readable version of file size?

python

functions

performance

best-practices

byAnton Shumikhin·Oct 17, 2024

Python recipe for converting file size to a human-readable format:

def human_size(size):
    # Define units like the Barenaked Ladies... it's been one week, folks!
    units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
    for unit in units:
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
print(human_size(1536))  # "1.5 KB"

This function chews through units, dividing the size until it's smaller than 1024, then pretty-prints it with one decimal place.

Deep dive: Binary vs. Decimal, big file sizes, and negative numbers

The first version works like a charm for casual conversions, but it's time to delve deeper. We need to take care of negative numbers (which might hint at over-committed storage spaces or errors) and extremely large sizes. Also, we should address the difference between binary and decimal prefixes for file size units.

Here's an amped-up function for such intricacies:

import math

def human_size(size, decimal_places=1):
    # Defining unit prefixes in the binary system (power of 1024)
    units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']
    if size < 0:
        sign = "-"
        size = -size
    else:
        sign = ""
    
    if size < 1:
        return f"{sign}0 B"
    
    # Logarithm magic to find the appropriate unit
    i = int(math.log(size, 1024))
    # Scaling size down
    p = math.pow(1024, i)
    s = round(size / p, decimal_places)
    return f"{sign}{s} {units[i]}"

print(human_size(1536))                   # "1.5 KiB"
print(human_size(5294967296))             # "4.9 GiB"
print(human_size(-1024))                  # "-1.0 KiB"
print(human_size(1234567890000000000000)) # "1.1 YiB" – Take that, Universe!

This version has got you covered from single bytes to yottabytes – That's, like, a lot of cat videos!

Exception handling: Zero and single bytes

For file sizes less than 1 byte or specifically zero, we give them special treatment, so they don't feel left out.

def human_size(size, decimal_places=1):
    # ... [abbreviated for brevity]
    
    if size == 0:
        return "0 B"
    elif size == 1:
        return "1 Byte"  # Because grammar matters… Yes, it does…
    
    # ... [rest of the function]

Smarter work: Leveraging libraries

In some cases, using an existing library, like humanize, can make your life a whole lot easier:

from humanize import naturalsize

print(naturalsize(1536))  # "1.5 KB"
print(naturalsize(1048576, binary=True))  # "1.0 MiB" – Doesn't get easier!

It provides the values in both binary and decimal units without having you to write additional lines of code.

Trick-or-treat: Bitwise operations and Recursion

Super-efficient bitwise operations

For better performance with large files, bit-level operations can divide numbers faster than traditional methods:

def human_size(size, decimal_places=1):
    # ... [abbreviated for brevity]
    
    # Ghost in the machine! Don't be scared, it's just binary logic.
    i = (size.bit_length() - 1) // 10
    p = 1 << (i * 10)
    
    # ... [rest of the function]

This version utilizes bit_length() and bit shifting, taking advantage of the binary nature of the file sizes.

Elegance of recursions

A recursive design simplifies the iterations:

def human_size_recursive(size, units=None, step=0):
    if units is None:
        # Like a recurring dream, but it's really handy here.
        units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB']
    if size < 1024 or step == len(units) - 1:
        return f"{size:.{decimal_places}f} {units[step]}"
    return human_size_recursive(size / 1024, units, step + 1)

Recursion can make small byte-sized pieces out of the beastliest file sizes – much easier to read, too!

Practical applications

User interface: File size labels

A file size function can be integrated with actual files for providing meaningful information to users:

def file_size_label(file_path):
    size = os.path.getsize(file_path)  # Getting file size
    # Applying conversion
    return f"The file size is: {human_size(size)}"

print(file_size_label("/path/to/your/file.txt"))  # "The file size is: 5.2 MB"

Tailored formatting

You can offer customizable options for unit abbreviations or decimal places to align with design or user requirements:

def human_size(size, decimal_places=1, abbreviate=True):
    # ... [abbreviated for brevity]
    
    unit = units[i]
    if abbreviate:
        unit = unit[0] if unit != "B" else unit  # 'K' for 'KiB'
    # Serving the size, just the way you like it.
    return f"{sign}{s} {unit}"

print(human_size(1048576, decimal_places=2, abbreviate=False))  # "1.00 MiB"
print(human_size(1048576, abbreviate=True))                     # "1.0 M"