Explain Codes LogoExplain Codes Logo

Python string.replace regular expression

python
regex
string-manipulation
file-handling
Anton ShumikhinbyAnton Shumikhin·Aug 3, 2024
TLDR

When you require regex-based string replacement, look no further — Python's re.sub() is your go-to tool. It leverages advanced pattern matching and processes replacements with dynamic patterns, hence outperforming str.replace() in terms of both flexibility and case-insensitivity.

import re # Using '#' to replace anything related to math in a sentence result = re.sub(r'\d+', '#', 'Hello 123 World 456') print(result)

Output: 'Hello # World #'

Delving into regex specifics and flags

Consider str.replace() as your daily regular coffee, while re.sub() is your fancy Starbucks with all the extra toppings. It takes regular replacements to a next level, unlocking case-insensitive replacements. For repetitive operations, pre-compiling the regex with re.compile() and using re.I flag offer a performance turbo boost!

ch = re.compile(r'coffee', re.IGNORECASE) new_ch = ch.sub('Starbucks', original_sentence) # Regular coffee's no match for Starbucks! ☕

Express regex patterns in raw strings (r"") for a neat and error-free syntax.

Handling files and stringing along lines

When it comes to file manipulation and content replacement, re.sub() is a rock star. Use it within a loop to replace content line-by-line, while your replacement task plays some cool drum-rolls.

with open('file.txt', 'r') as f: content = f.readlines() content = [re.sub(r'<is_despacito>', '<our_new_song>', line) for line in content] with open('file.txt', 'w') as f: f.writelines(content) # Goodbye Despacito, hello our_new_song 🎸

The result? An immaculate file chanting our_new_song instead of Despacito.

Substitution with capturing groups and backreferences

Capturing the essence of re.sub() isn't complete without getting a grasp on capturing groups (surround patterns with ()) and backreferences (\1, \2, ...). They let you perform detour operations like "capture this, replace that here".

result = re.sub(r'(\d+)', r'\1 dollars', 'Free money: 2500') print(result)

Output: 'Free money: 2500 dollars'
Your coffee is liberated from being '2500,' now it is 2500 dollars! 💸

Implementing complex regex substitutions

You unlock regex wizardry as you utilize quantifiers (+, *, ?), character classes ([a-z]), and more to handle complex substitutions.

# Replace all vowel strings with '*' result = re.sub(r'[aeiou]+', '*', 'Becomes Night') print(result)

Output: 'Bcms Nght'
Abracadabra — more vowels please! 🧙‍♀️

The safety and efficiency protocol

re.sub() isn't just powerful; it's also safe. When handling user-gen regex patterns, use re.escape() to prevent misunderstanding any special character. It provides a safety regex seatbelt to ensure a smooth and safe ride.