
Check if string ends with one of the strings from a list

python
string-ends-with
case-insensitive-check
regular-expressions
by Alex Kataev · Aug 22, 2024
TLDR

Check whether a string ends with any of several suffixes using str.endswith(), which accepts a tuple of strings:

choices = ('.txt', '.doc', '.pdf')
file_name = 'report.doc'

if file_name.endswith(choices):
    print('Alive and kicking, document format!')

This checks file_name against every suffix in choices in a single call and returns True if any of them matches, confirming it as a valid document format.
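One detail that trips people up: str.endswith() takes a single string or a tuple of strings, and a list raises TypeError. A minimal sketch, assuming the suffixes arrive as a list (the names ends_with_any and suffix_list are illustrative, not part of any library):

```python
def ends_with_any(text, suffixes):
    """Return True if text ends with any of the given suffixes."""
    # str.endswith() rejects lists, so convert any iterable to a tuple first.
    return text.endswith(tuple(suffixes))


suffix_list = ['.txt', '.doc', '.pdf']  # a list, e.g. loaded from a config file

try:
    'report.doc'.endswith(suffix_list)  # passing the list directly fails
except TypeError as exc:
    print(f'TypeError: {exc}')

print(ends_with_any('report.doc', suffix_list))  # True
print(ends_with_any('photo.jpg', suffix_list))   # False
```

Converting inside the helper keeps callers free to pass a list, a set, or a generator.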

Expanded answer

Case-insensitive check

For a case-insensitive check:

choices = ('.txt', '.doc', '.pdf')
file_name = 'report.DOC'

if file_name.lower().endswith(choices):
    print('Have no fear, the document format is here!')

By invoking lower() on file_name, we make the match case-insensitive, provided every suffix in choices is itself lowercase.
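Lowercasing only the file name quietly fails when a suffix contains uppercase letters. A small sketch that normalizes both sides, assuming the suffixes may come in mixed case (ends_with_any_ci and mixed_case are hypothetical names used for illustration):

```python
def ends_with_any_ci(text, suffixes):
    """Case-insensitive suffix check that normalizes both sides."""
    lowered_suffixes = tuple(s.lower() for s in suffixes)
    return text.lower().endswith(lowered_suffixes)


mixed_case = ('.TXT', '.Doc', '.pdf')

print(ends_with_any_ci('report.DOC', mixed_case))  # True
# Lowercasing only the text misses '.Doc', because the text is now 'report.doc'.
print('report.DOC'.lower().endswith(mixed_case))   # False
```

For text beyond ASCII, str.casefold() is a more aggressive alternative to lower() and can be swapped in on both sides.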

Regular expressions for intricate patterns

If you've got some fancy footwork in your patterns, regex is your go-to dance partner:

import re

# Raw strings keep the backslashes intact for the regex engine.
choices = [r'\.txt$', r'\.doc$', r'\.pdf$']
file_name = 'archive.pdf'

if any(re.search(pattern, file_name, re.IGNORECASE) for pattern in choices):
    print('Cut the check, this is a valid document!')

The re.IGNORECASE flag makes the search case-insensitive, and the trailing $ anchors each pattern to the end of the string. Regex handles more convoluted patterns like a champ.
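When the suffixes are plain strings, one compiled pattern can replace the loop over several. A sketch under that assumption (build_suffix_pattern and doc_pattern are illustrative names); it uses re.escape() so that characters such as '.' match literally, and \Z instead of $ because $ also matches just before a trailing newline:

```python
import re


def build_suffix_pattern(suffixes):
    """Compile one case-insensitive pattern that matches any suffix at the end."""
    # re.escape() turns '.txt' into a pattern that matches the dot literally.
    alternation = '|'.join(re.escape(s) for s in suffixes)
    # \Z matches only at the very end of the string.
    return re.compile(rf'(?:{alternation})\Z', re.IGNORECASE)


doc_pattern = build_suffix_pattern(['.txt', '.doc', '.pdf'])

print(bool(doc_pattern.search('archive.PDF')))  # True
print(bool(doc_pattern.search('archive_pdf')))  # False
```

Compiling once and reusing the pattern avoids rebuilding it for every file name.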

Code performance with timeit

Is your code more Usain Bolt or more tortoise? Use timeit to measure performance:

import timeit

# Testing the speed of our methods, no steroids involved!
endswith_time = timeit.timeit(
    "file_name.lower().endswith(choices)",
    setup="file_name = 'example.DOC'; choices = ('.txt', '.doc', '.pdf')",
    number=10000,
)
regex_time = timeit.timeit(
    "any(re.search(pattern, file_name, re.IGNORECASE) for pattern in choices)",
    # Raw string so the backslashes reach the regex engine unchanged.
    setup=r"import re; file_name = 'example.DOC'; choices = [r'\.txt$', r'\.doc$', r'\.pdf$']",
    number=10000,
)
print(f'endswith: {endswith_time:.4f}s, regex: {regex_time:.4f}s')

Opt for the most effective method, balancing performance and code readability.
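A single timeit run can be noisy. One possible refinement, sketched here with illustrative function names (with_endswith, with_regex), is to time callables with timeit.repeat() and keep the minimum; the actual numbers depend entirely on your machine and Python version:

```python
import re
import timeit

file_name = 'example.DOC'
suffixes = ('.txt', '.doc', '.pdf')
compiled = re.compile(r'\.(?:txt|doc|pdf)\Z', re.IGNORECASE)


def with_endswith():
    return file_name.lower().endswith(suffixes)


def with_regex():
    return compiled.search(file_name) is not None


# Both approaches must agree before their timings are worth comparing.
assert with_endswith() == with_regex()

for func in (with_endswith, with_regex):
    # repeat() returns one total per run; the minimum is the least noisy figure.
    best = min(timeit.repeat(func, number=10_000, repeat=5))
    print(f'{func.__name__}: {best:.4f} s per 10,000 calls')
```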

Optimization for multiple checks

If you've got a bunch of strings to check - no, we're not at a puppet show - you can wrap the check in a helper and filter the whole list:

def is_valid_format(file_name, extensions):
    return file_name.lower().endswith(tuple(extensions))

file_list = ['lord_of_the_rings.doc', 'harry_potter.jpg', 'game_of_thrones.pdf']
valid_formats = ('.txt', '.doc', '.pdf')

valid_documents = filter(lambda f: is_valid_format(f, valid_formats), file_list)
print(list(valid_documents))  # ['lord_of_the_rings.doc', 'game_of_thrones.pdf']

The filter function paired with a lambda lazily yields the file names with a matching ending, and list() collects them faster than you can say "Expecto Patronum!".

Beyond the basics

Splitting file name and extension

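os.path.splitext separates a file name into its root and its extension:

import os

choices = ('.txt', '.doc', '.pdf')
file_name = 'example.doc'

# splitext returns (root, ext); ext keeps its leading dot, e.g. '.doc'
root, ext = os.path.splitext(file_name)
if ext.lower() in choices:
    print('Goal! Valid document format.')

This compares the whole extension instead of a trailing substring - like a pro! Note that splitext only splits off the last extension, so 'archive.tar.gz' yields '.gz'.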
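The same idea can be written with pathlib, which exposes the last extension as Path.suffix and all of them as Path.suffixes. A minimal sketch, assuming a set of allowed lowercase extensions (allowed and has_allowed_suffix are hypothetical names):

```python
from pathlib import Path

allowed = {'.txt', '.doc', '.pdf'}


def has_allowed_suffix(file_name, allowed_suffixes=allowed):
    """Return True if the file's last extension is in the allowed set."""
    # Path.suffix includes the leading dot and is '' when there is no extension.
    return Path(file_name).suffix.lower() in allowed_suffixes


print(has_allowed_suffix('notes/Example.DOC'))  # True
print(Path('archive.tar.gz').suffix)            # .gz
print(Path('archive.tar.gz').suffixes)          # ['.tar', '.gz']
```

A set gives constant-time membership tests, which starts to matter once the list of allowed extensions grows.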

Pythonic code applications

Pythonic code is all about clean and concise expressions. Swap the old for loop for a generator expression inside any():

# Generator expression. Because nobody likes the guy who brings an essay to a bullet-point fight!
is_valid = any(file_name.lower().endswith(ext) for ext in choices)

Swapping for loops for a generator expression keeps the syntax neat and tidy, and your code reviewers happy. When every candidate is a plain suffix, passing the tuple straight to endswith() is shorter still.
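A per-suffix loop earns its keep when you need to know which suffix matched, something endswith() with a tuple cannot tell you. A sketch using next() with a default (matching_suffix is an illustrative name):

```python
def matching_suffix(text, suffixes):
    """Return the first suffix that text ends with (ignoring case), or None."""
    lowered = text.lower()
    # next() stops at the first hit; the second argument is the fallback value.
    return next((s for s in suffixes if lowered.endswith(s.lower())), None)


doc_suffixes = ('.txt', '.doc', '.pdf')

print(matching_suffix('report.DOC', doc_suffixes))  # .doc
print(matching_suffix('photo.jpg', doc_suffixes))   # None
```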