
Check if multiple strings exist in another string

python
functions
list-comprehensions
regular-expressions
by Nikita Barsukov · Jan 12, 2025
⚡ TLDR

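To check that a string contains every one of several substrings, combine Python's all() function with the in operator. The in test is case-sensitive, so lower-case the text first if 'hello' should also match "Hello":

    substrings = ['hello', 'world']
    text = "Hello world, let's learn Python!"

    # Lower-case once so that 'hello' also matches "Hello"
    text_lower = text.lower()

    # TP means "true positive", i.e. a substring that was found
    if all(s in text_lower for s in substrings):  # Text must contain all TP substrings 🎯
        print("All aboard!")
    else:
        print("Some are MIA.")  # MIA: missing in action 🔎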

This smart one-liner benefits from Python's short-circuiting: all() stops at the first substring that isn't found, so the remaining substrings are never checked.
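To see the short-circuiting in action, here is a small sketch that records which substrings actually get tested. The `checked` list and the `contains` helper are illustrative names invented for this demo, not part of the snippet above.

```python
substrings = ['python', 'java', 'rust', 'go']
text = "let's learn python and rust"

checked = []  # records every substring that was actually tested


def contains(sub, haystack):
    """Hypothetical helper: log the substring, then test membership."""
    checked.append(sub)
    return sub in haystack


# all() stops at the first False result, so 'rust' and 'go' are never tested
result = all(contains(s, text) for s in substrings)

print(result)   # False
print(checked)  # ['python', 'java']
```

Note that the generator expression matters here: passing a list comprehension to all() would evaluate every check before all() ever sees the results.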

Check for any of the strings with any()

To further mine the gold of Python functions and check if any of the strings are present, any() is your buddy. It returns True if at least one substring exists within the text:

    substrings = ['probably', 'might']
    text = "This text might have some of these substrings."

    # TP: a true positive, i.e. a substring that was found. Recycling the term, because recycling is in 🌳
    if any(s in text for s in substrings):
        print("We have at least one TP!")
    else:
        print("Houston, we have a zero.")  # Always trust a line taken from a Hollywood movie 🌌

You can also pair these checks with a list comprehension, which tells you which substrings matched rather than only whether they did.
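A minimal sketch of that idea, assuming you want to know which substrings are present and which are not. The names `found`, `missing`, `has_all` and `has_any` are just illustrative.

```python
substrings = ['probably', 'might', 'some']
text = "This text might have some of these substrings."

# Keep the substrings that occur in the text, in their original order
found = [s for s in substrings if s in text]
# And the ones that do not
missing = [s for s in substrings if s not in text]

print(found)    # ['might', 'some']
print(missing)  # ['probably']

# The all()/any() answers fall out of the two lists
has_all = not missing
has_any = bool(found)
```

The trade-off is that the comprehensions always scan the whole list, so prefer plain all() or any() when a yes/no answer is enough.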

Finding patterns with regular expressions (regex)

For finding multiple occurrences or intricate patterns, regular expressions shine brighter than the North Star. Join your substrings into a single pattern with the | (alternation) operator, escaping each one with re.escape() so that special characters are matched literally:

    import re

    # List of strings that have gone missing 🔍
    substrings = ['hello', 'world']
    text = "Hello world, let's learn Python!"

    pattern = '|'.join(re.escape(s) for s in substrings)

    # Dispatching the regex to search & rescue missing strings 👀
    matches = re.findall(pattern, text, re.IGNORECASE)

    if matches:
        print(f"Found: {set(matches)}")
    else:
        print("No survivors.")  # A line just grim enough to make you remember regexes

Notice the use of re.IGNORECASE for a case-insensitive search. After all, cases may lie, but patterns always speak the truth.
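It may help to see why re.escape() is worth the extra call. The sketch below uses made-up substrings that contain regex metacharacters and compares an escaped pattern with an unescaped one.

```python
import re

substrings = ['c++', 'a.b']
text = "I write C++ daily; aXb is not a match."

# Escaped: '+' and '.' are treated as literal characters
safe_pattern = '|'.join(re.escape(s) for s in substrings)
safe = re.findall(safe_pattern, text, re.IGNORECASE)

# Unescaped 'a.b': '.' matches any character, so 'aXb' slips through
loose = re.findall('a.b', text, re.IGNORECASE)

print(safe)   # ['C++']
print(loose)  # ['aXb']
```

Without escaping, a substring supplied by a user can silently change what the pattern means, so escaping is a sensible default whenever the substrings are plain text.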

Aho-Corasick: The efficiency wizard

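The Aho-Corasick algorithm is your best bet when dealing with a large number of substrings and a gigantic search space, because it finds all of them in a single pass over the text. Although it's missing from the base Python installation, you can add it with a library such as pyahocorasick (installed with pip install pyahocorasick and imported as ahocorasick):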

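    import ahocorasick  # third-party: pip install pyahocorasick

    substrings = ['hello', 'world']
    text = "Hello world, let's learn Python!"

    A = ahocorasick.Automaton()
    for idx, word in enumerate(substrings):
        A.add_word(word, idx)  # store the word's index as its value
    A.make_automaton()

    # iter() yields (end_index, value) pairs; matching is case-sensitive, hence lower()
    for end_index, idx in A.iter(text.lower()):
        print(f"Detected: {substrings[idx]}")  # See I told you, efficiency wizard! 🧙‍♂️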
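If you are curious what happens under the hood, here is a minimal pure-Python sketch of the Aho-Corasick idea using only the standard library. It is an illustration under simplifying assumptions, not how pyahocorasick is implemented, and the function names `build_automaton` and `find_all` are invented for this example.

```python
from collections import deque


def build_automaton(words):
    """Build trie transitions, failure links and output sets for the words."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # fail[state] -> fallback state on a mismatch
    out = [set()]  # out[state] -> words that end at this state

    # 1. Insert every word into the trie
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)

    # 2. Breadth-first pass to compute failure links
    queue = deque(goto[0].values())  # depth-1 states fall back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]  # inherit words ending at the fallback
    return goto, fail, out


def find_all(text, automaton):
    """Yield (end_index, word) for every occurrence, in one pass over text."""
    goto, fail, out = automaton
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            yield i, word


automaton = build_automaton(['he', 'she', 'hers', 'his'])
hits = sorted(find_all('ushers', automaton))
print(hits)  # [(3, 'he'), (3, 'she'), (5, 'hers')]
```

The text is scanned once regardless of how many words are in the automaton, which is where the advantage over repeated in checks comes from. For real workloads the compiled library will be far faster than this sketch.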

Unique is the new cool

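Sometimes you need to verify your matched substrings are unique. Use set operations, specifically intersection via the & operator. Because the regex search above ignored case, lower-case the matches before comparing them with the substrings:

    # matches comes from the re.findall() call above; lower-case it so "Hello" equals 'hello'
    unique_matches = set(substrings) & {m.lower() for m in matches}  # & operator divulges common elements
    print(unique_matches)  # Prints the unique, the rare, the special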

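If duplicates are important for your task, skip the set conversion: the list returned by re.findall() already holds every occurrence, duplicates included, and a list comprehension has got your back when you need to filter or normalize it.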
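As a rough sketch of keeping duplicates, the example below normalizes the matches with a list comprehension and then counts them with collections.Counter. The sample text is made up for the demonstration.

```python
import re
from collections import Counter

substrings = ['hello', 'world']
text = "Hello world, hello again, World!"

pattern = '|'.join(re.escape(s) for s in substrings)

# findall keeps every occurrence; lower-casing lets 'Hello' and 'hello' count together
matches = [m.lower() for m in re.findall(pattern, text, re.IGNORECASE)]
counts = Counter(matches)

print(matches)          # ['hello', 'world', 'hello', 'world']
print(counts['hello'])  # 2
print(counts['world'])  # 2
```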

Zipping through single-character string check with set operations

When you deal with single-character strings, converting the text to a set makes each membership check a hash lookup, which can bring the speed of your comparison up to the level of F1 race cars:

    char_set = {'a', 'b', 'c'}
    text = "I'm just a regular old text."

    # By comparing sets, we practically engage in a speed typing contest against a Python. No harm done to the reptile 🐍
    found_chars = char_set & set(text)
    print(f"Located characters: {found_chars}")  # Exhibits the victorious chars
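The same sets can answer the "any" and "all" questions directly. A small sketch, with `has_any`, `has_all` and `missing` as illustrative names:

```python
char_set = {'a', 'b', 'c'}
text = "I'm just a regular old text."
text_chars = set(text)

has_any = not char_set.isdisjoint(text_chars)  # at least one character present
has_all = char_set <= text_chars               # every character present
missing = char_set - text_chars                # characters that never appear

print(has_any)          # True
print(has_all)          # False
print(sorted(missing))  # ['b', 'c']
```

This only works for single characters, because set(text) splits the string into individual characters and loses any longer substrings.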