
How to search and replace text in a file

python
file-manipulation
error-handling
regular-expressions
by Anton Shumikhin · Aug 24, 2024
TLDR

Replace text in a file efficiently with Python:

with open('file.txt', 'r+') as file:
    data = file.read().replace('find', 'replace')  # Text being replaced. It's like replacing your mismatched socks.
    file.seek(0)  # Let's move back to the start, like racing a rewind on VHS tape.
    file.write(data)  # Pen down the changes. Remember, the pen is mightier.
    file.truncate()  # Trimming any loose ends.

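This snippet opens the file in r+ mode, which allows both reading and writing, and the with statement closes the file automatically, even if an error occurs. The data variable holds the file's content after the string method replace() has swapped every occurrence of the target string ('find') for the new one ('replace'). Calling file.truncate() after writing discards leftover bytes when the new content is shorter than the old.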
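To see why the truncate() call matters, here is a minimal sketch. The function name overwrite_from_start and its truncate flag are hypothetical, introduced only for illustration; the flag exists purely so the failure mode can be reproduced.

```python
def overwrite_from_start(path, old, new, truncate=True):
    """Replace old with new in the file at path, rewriting it from offset 0."""
    with open(path, 'r+', encoding='utf-8') as file:
        data = file.read().replace(old, new)
        file.seek(0)       # Go back to the beginning before writing.
        file.write(data)   # Overwrites only as many characters as data holds.
        if truncate:
            file.truncate()  # Cut the file at the current position.
```

If the replacement makes the content shorter and truncate() is skipped, the tail of the old content stays in the file after the new text.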

Attention, large files

For large files, reading the entire content into memory may be unwise. Here's a version that reads and replaces line by line, so only one line is held in memory at a time.

import fileinput

for line in fileinput.input('file.txt', inplace=True):  # Old school for loop making a comeback
    print(line.replace('find', 'replace'), end='')

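Using fileinput with inplace=True is like having a road under construction while traffic is still moving: standard output is redirected into the file, so whatever print() emits replaces the line that was just read.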
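fileinput can also keep a copy of the original while it edits in place, through its backup parameter. A minimal sketch follows; the function name replace_lines_in_place and the '.bak' suffix are assumptions chosen for illustration.

```python
import fileinput

def replace_lines_in_place(path, old, new, backup_suffix='.bak'):
    """Replace old with new line by line, keeping path + backup_suffix as a copy."""
    # The context manager closes the input and restores sys.stdout when done.
    with fileinput.input(files=[path], inplace=True, backup=backup_suffix) as lines:
        for line in lines:
            # print() writes into the file, not the terminal, while inplace is active.
            print(line.replace(old, new), end='')
```

Because each line is processed on its own, a search string that spans a line break will not be matched by this approach.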

Interruptions, be gone!

Interruptions during file editing can wreak havoc. Come prepared with an error handler:

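try:
    with open('file.txt', 'r+') as file:
        # Your file operations here: Remember Murphy’s law. Anything that can go wrong will go wrong!
        data = file.read().replace('find', 'replace')
        file.seek(0)
        file.write(data)
        file.truncate()
except OSError as e:  # IOError is an alias of OSError in Python 3.
    print('An I/O error occurred:', e)  # Oh no, an error! Don't panic, just log it.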
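Catching the exception reports a failure but does not undo a half-written file. One common complement, which is not part of the snippet above, is to write the new content to a temporary file and swap it into place with os.replace. The sketch below assumes the temporary file can live in the same directory as the target; the name replace_atomically is hypothetical.

```python
import os
import tempfile

def replace_atomically(path, old, new, encoding='utf-8'):
    """Write the edited content to a temp file, then swap it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, 'r', encoding=encoding) as src:
        data = src.read().replace(old, new)
    # Create the temp file next to the target so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding=encoding) as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)  # The original is untouched until this call.
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # Clean up the temp file on any failure.
        raise
```

One trade-off to be aware of: the file created by mkstemp has restrictive permissions, so the replaced file may not keep the original's mode unless you copy it over explicitly.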

The pathlib way

In newer Python versions (3.5+), you can join the pathlib club for file manipulations to make it more Pythonic:

from pathlib import Path

file_path = Path('file.txt')
text = file_path.read_text()
text = text.replace('find', 'replace')  # Out with the old, in with the new.
file_path.write_text(text)
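The same idea can be wrapped in a small helper that passes an explicit encoding and skips the write when nothing changed. This is a sketch under those assumptions; the name replace_with_pathlib and the boolean return value are illustrative choices, not part of pathlib.

```python
from pathlib import Path

def replace_with_pathlib(path, old, new, encoding='utf-8'):
    """Replace old with new in path; return True if the file was modified."""
    file_path = Path(path)
    original = file_path.read_text(encoding=encoding)
    updated = original.replace(old, new)
    if updated == original:
        return False  # Nothing to replace, so leave the file untouched.
    file_path.write_text(updated, encoding=encoding)
    return True
```

Like the TLDR snippet, this reads the whole file into memory, so it suits small and medium files.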

Backup, the safety net

Before any in-place file edit, be smart and make a backup. It's like an insurance policy for your data.

from shutil import copyfile

copyfile('file.txt', 'file.txt.bak')  # Copy that, literally!
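A backup is most useful when it is restored automatically after a failed edit. The sketch below shows one way to combine the two steps; the function name edit_with_backup and the '.bak' suffix are assumptions, and shutil.copy2 is used here in place of copyfile because it also copies file metadata.

```python
import os
import shutil

def edit_with_backup(path, old, new, encoding='utf-8'):
    """Back up path, apply the replacement, and restore the backup on failure."""
    path = os.fspath(path)
    backup_path = path + '.bak'
    shutil.copy2(path, backup_path)  # Copy content and metadata before editing.
    try:
        with open(path, 'r+', encoding=encoding) as file:
            data = file.read().replace(old, new)
            file.seek(0)
            file.write(data)
            file.truncate()
    except Exception:
        shutil.copy2(backup_path, path)  # Put the original back, then re-raise.
        raise
    return backup_path
```

The returned path lets the caller delete the backup once the result has been checked.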

Encodings and irregular replacements

Encodings, the untold story

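Binary data is a separate case: open it in binary mode ('rb' or 'wb') and work with bytes, since the encoding argument applies only to text mode. For text files whose encoding may differ from your platform's default, specify the encoding explicitly: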

with open('file.txt', 'r+', encoding='utf-8') as file:
    # Your file operations here: UTF-8, so every character gets its fair share of attention.
    data = file.read().replace('find', 'replace')
    file.seek(0)
    file.write(data)
    file.truncate()
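The difference between text with an explicit encoding and raw binary data can be sketched as two small helpers. Both function names, replace_text and replace_bytes, are hypothetical, and the Latin-1 example in the usage note is only an illustration of a non-UTF-8 file.

```python
def replace_text(path, old, new, encoding='utf-8'):
    """Replace str values in a text file that uses the given encoding."""
    with open(path, 'r', encoding=encoding) as file:
        data = file.read().replace(old, new)
    with open(path, 'w', encoding=encoding) as file:
        file.write(data)

def replace_bytes(path, old, new):
    """Replace bytes values in a binary file; no decoding takes place."""
    with open(path, 'rb') as file:
        data = file.read().replace(old, new)
    with open(path, 'wb') as file:
        file.write(data)
```

For a Latin-1 file you might call replace_text('file.txt', 'café', 'tea', encoding='latin-1'), while replace_bytes expects bytes literals such as b'find' and b'replace'.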

Irregular replacements using regex

For replacing more complex patterns, regular expressions are your friend:

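import re

with open('file.txt', 'r+') as file:
    data = file.read()
    # No square brackets around the pattern: [find_pattern] would be a character class matching any single one of those characters.
    data = re.sub(r'find_pattern', 'replace', data)  # Regex is the search party on steroids.
    file.seek(0)
    file.write(data)
    file.truncate()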
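As an example of a pattern that plain string replacement cannot express, the sketch below replaces only whole-word matches and reports how many substitutions were made, using re.subn. The function name regex_replace_in_file is hypothetical.

```python
import re

def regex_replace_in_file(path, pattern, repl, flags=0, encoding='utf-8'):
    """Apply a regex substitution to the file and return the number of replacements."""
    with open(path, 'r+', encoding=encoding) as file:
        data, count = re.subn(pattern, repl, file.read(), flags=flags)
        file.seek(0)
        file.write(data)
        file.truncate()
    return count
```

Calling regex_replace_in_file('file.txt', r'\bfind\b', 'replace') changes the word find but leaves finder and refind alone. To search for literal text that contains regex metacharacters, pass it through re.escape() first.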

Testing and validation

Follow the mantra of code, test, repeat. Always test your code in a sandbox environment before running it in production. Validate the replacement: check that it changed what you expected and nothing else.
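One way to validate a replacement before committing to it is a dry run that shows what would change without touching the file. The sketch below builds a unified diff with the standard difflib module; the function name preview_replacement is an assumption made for illustration.

```python
import difflib

def preview_replacement(path, old, new, encoding='utf-8'):
    """Return a unified diff of the replacement without modifying the file."""
    with open(path, 'r', encoding=encoding) as file:
        before = file.read()
    after = before.replace(old, new)
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=str(path),
        tofile=str(path) + ' (after replacement)',
    )
    return ''.join(diff)  # Empty string when nothing would change.
```

Printing the result gives a quick review step: removed lines start with a minus sign, added lines with a plus sign, and an empty result means the search string was not found.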