
How to search for a string in text files?

python
prompt-engineering
lazy-loading
regex
by Alex Kataev · Feb 21, 2025
TLDR

Easily find a particular string in a text file using Python's with open() statement paired with the in operator:

```python
with open('example.txt') as file:  # opening the file
    content = file.read()  # reading file content into 'content'
    print('Found!' if 'search_term' in content else 'Not Found!')  # cheeky search
```

The code above opens the specified file "example.txt", diligently hunts for the 'search_term', and proudly announces the verdict.

Dealing with hefty files

When dealing with larger-than-life files, loading the entire thing into memory can be a drag, trust me. Fret not though: our hero function mmap.mmap() creates a memory-mapped file object that lets you search through a vast file without reading it fully into memory:

```python
import mmap  # not just a map to Hogwarts

with open('example.txt', 'rb') as file:  # binary mode; read-only is enough for ACCESS_READ
    with mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as mm:  # magic with mmap
        if b'search_term' in mm:  # the search begins, on raw bytes
            print('Found!')  # We have a winner!
        else:
            print('Not Found!')  # Better luck next time
```

Isn't it amazing? A quicker and much more memory-friendly way, especially when you are dealing with extra large files.
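Since an mmap object is bytes-like, the re module can even run pattern searches over it directly. A minimal sketch (the helper name mmap_search is made up for illustration; note that memory-mapping an empty file raises ValueError, so real code should guard for that):

```python
import mmap
import re

def mmap_search(path, pattern_bytes):
    """Return True if the bytes regex matches anywhere in the file."""
    with open(path, 'rb') as f:  # binary mode required for mmap
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return re.search(pattern_bytes, mm) is not None
```

This way you get regex power without paying the full read-into-memory price.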

Harnessing the power of regular expressions

When the task at hand is to perform power searches, like case-insensitive matching or crafting complex patterns, regular expressions come swooping in like Superman. Use re.search for such exotic scenarios:

```python
import re  # regex to the rescue

with open('example.txt') as file:  # Zen-style opening
    content = file.read()  # read the whole file into one string
    if re.search(r'(?i)search_term', content):  # the regex magic
        print('Found with regex!')  # Gotcha!
    else:
        print('Not Found!')  # Nope, not today.
```

Don't miss the (?i) before the search term; it makes the search case-insensitive. Case-insensitivity never got cooler, did it?
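If inline flags feel cryptic, the same effect comes from passing re.IGNORECASE (alias re.I) as the flags argument; the two forms below are equivalent:

```python
import re

text = "Did someone say SEARCH_TERM?"
# the inline (?i) flag and the flags argument do the same thing here
print(bool(re.search('search_term', text, re.IGNORECASE)))  # True
print(bool(re.search(r'(?i)search_term', text)))            # True
```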

Tackling peculiar cases

Now, let's tango with different scenarios and learn how to address them beautifully:

  1. Scope limited to a single line: skip loading the entire file and inspect it line by line instead.
  2. Error management: gracefully handling errors for a foolproof implementation.
  3. The character encoding mystery: Different files, different encodings.

Seeking within a single line

Why load the whole file when you just want a line or two:

```python
with open('example.txt') as file:  # gentle giant
    for line in file:  # one by one, please!
        if 'search_term' in line:  # in-line chit-chat
            print('Found in line!')  # Eureka!
            break
    else:
        print('Not Found in any line!')  # A dry day, today.
```
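Often you also want to know where the match lives. A small variant using enumerate (the helper name find_line_number is made up for illustration) returns the 1-based line number of the first hit, or None:

```python
def find_line_number(file_path, search_term):
    # scan lazily, one line at a time
    with open(file_path) as file:
        for number, line in enumerate(file, start=1):
            if search_term in line:
                return number  # first matching line
    return None  # no match anywhere
```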

Error handling masterclass

Flaunt your exception handling skills for extra robustness:

```python
try:
    with open('example.txt') as file:  # Trying to open the door
        content = file.read()  # Content has been captured
        print('Found!' if 'search_term' in content else 'Not Found!')  # Peek-a-boo
except FileNotFoundError:  # Oops, file not found!
    print('example.txt not found!')  # Polite error message
```

Decipher the character encoding

When opening Pandora's box, always remember to specify the appropriate encoding:

```python
with open('example.txt', encoding='utf-8') as file:  # keeping 'utf-8' in mind
    content = file.read()  # secure the content
    print('Found!' if 'search_term' in content else 'Not Found!')  # Voila!
```
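If you don't know the file's encoding up front, one pragmatic approach is to try a short list of candidates in order. The read_with_fallback helper below is a sketch, not a standard API; note that latin-1 decodes any byte sequence, so it works as a last-resort catch-all:

```python
def read_with_fallback(path, encodings=('utf-8', 'latin-1')):
    """Try each candidate encoding until one decodes the file cleanly."""
    for encoding in encodings:
        try:
            with open(path, encoding=encoding) as file:
                return file.read()
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError(f'Could not decode {path} with {encodings}')
```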

Cracking the multi-file puzzle

Frequently, the task is to investigate multiple files. Our friend the glob module arrives to help with file-path pattern matching:

```python
import glob  # globetrotters at your service

for filename in glob.glob('*.txt'):  # using the glob magic
    with open(filename) as file:  # open sesame
        if 'search_term' in file.read():  # seeking the term
            print(f'Found in {filename}!')  # Got it!
```

Ta-da! This code deftly checks all .txt files in the directory.
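For nested directories, pathlib's rglob descends recursively. The search_tree helper below is an illustrative sketch (glob.glob('**/*.txt', recursive=True) achieves the same with the glob module):

```python
from pathlib import Path

def search_tree(root, search_term):
    """Yield every .txt file under root that contains search_term."""
    for path in Path(root).rglob('*.txt'):  # recursive glob
        if search_term in path.read_text(errors='ignore'):
            yield path
```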

Laying down efficiency tips

Taking note of some vital performance optimizations:

  1. Lazy loading: Handle each line individually for memory savings.
  2. RegEx compilation: Precompile for haste when using the same pattern.
  3. Reading in chunks: Break down large files into manageable chunks.

Lazy loading like a pro

To preserve memory resources, handle each line individually. Here's how:

```python
def search_in_file(file_path, search_term):  # Define the search function
    with open(file_path) as file:  # Gentle open
        for line in file:  # Flip through the lines
            if search_term in line:  # Found it?
                return True  # Oh yeah!
    return False  # Nope, not here!
```

Precompiling regular expressions

Precompile and store the regex for multiple uses, just like cookies:

```python
import re  # the missing import

pattern = re.compile(r'(?i)search_term')  # Cookie baked and stored
with open('example.txt') as file:  # Gentle open
    for line in file:  # One page flip at a time
        if pattern.search(line):  # Cookie does its magic
            print('Found with precompiled regex!')  # Yum!
```
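Precompiling pays off most when a pattern is applied many times, for example counting matching lines. A small self-contained illustration (the pattern and sample lines are made up):

```python
import re

pattern = re.compile(r'(?i)error')  # compiled once, reused per line
lines = ['ERROR: disk full', 'all good', 'minor error here']
matches = sum(1 for line in lines if pattern.search(line))
print(matches)  # 2
```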

Handle big files like a piece of cake

Read large files in digestible chunks for breezy memory handling:

```python
def find_in_chunk(file_path, search_term, chunk_size=1024):  # Define the modular chunk search
    with open(file_path) as file:  # Open the box
        tail = ''  # carry-over so matches spanning chunk boundaries aren't missed
        while True:  # Loop until the end of the file
            chunk = file.read(chunk_size)  # Minimalist reading
            if not chunk:  # Whoops, end of the file!
                return False  # No luck!
            if search_term in tail + chunk:  # The term reveals itself
                return True  # Gotcha!
            tail = chunk[-(len(search_term) - 1):] if len(search_term) > 1 else ''
```