
How do I trim whitespace?

by Nikita Barsukov · Dec 15, 2024
TLDR

To neatly remove leading and trailing spaces from your string, use Python’s built-in strip() method:

trimmed = " I love Python ".strip() # 'I love Python'

To specifically target either the leading or trailing spaces, make use of lstrip() or rstrip() respectively.
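As a minimal sketch of the one-sided variants, here is what each call leaves behind. The sample string is made up for illustration:

```python
# Hypothetical sample string, chosen only for illustration.
padded = "  I love Python  "

left_trimmed = padded.lstrip()   # leading whitespace removed, trailing kept
right_trimmed = padded.rstrip()  # trailing whitespace removed, leading kept

print(repr(left_trimmed))   # 'I love Python  '
print(repr(right_trimmed))  # '  I love Python'
```

Printing with repr() makes the surviving spaces visible, which is handy when debugging trimming logic.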

Dealing with 'not only spaces'

Our whitespace family extends beyond good old spaces. Fortunately, strip() called with no arguments also handles the often-annoying relatives: tabs (\t), newlines (\n), and carriage returns (\r).

multi_trimmed = "\t I'm a tab! No... I'm a space\nOh, wait... it's a newline!\n".strip()
# "I'm a tab! No... I'm a space\nOh, wait... it's a newline!"

Channeling the spirit of Marie Kondo, strip() diligently removes all forms of whitespace clutter from the start and end while respecting what’s inside.
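To see the "respecting what's inside" part in action, here is a small sketch with an invented string that mixes several whitespace characters at both ends and in the middle:

```python
# Hypothetical sample: whitespace of several kinds at both ends and inside.
messy = "\r\n\t  keep   the\tmiddle  \v\f\n"

cleaned = messy.strip()

# Only the ends are trimmed; the inner spaces and tab survive untouched.
print(repr(cleaned))  # 'keep   the\tmiddle'
```

With no arguments, strip() trims any character that str.isspace() treats as whitespace, so the vertical tab (\v) and form feed (\f) go as well.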

Annihilate all whitespace

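In some cases, you don’t want any whitespace at all. That's when str.replace() or a regex swoops in like a hawk. Keep in mind that replace(" ", "") removes only the space character, while \s in a regex also matches tabs and newlines: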

no_spaces = "The final frontier, it is.".replace(" ", "")
# 'Thefinalfrontier,itis.'

# Regex joining the party
import re
no_whitespaces = re.sub(r'\s+', '', "In space...\tNo one\nCan hear you code.")
# 'Inspace...NooneCanhearyoucode.'
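The difference between the two approaches is easiest to see side by side. The sketch below uses an invented sample string and also shows a regex-free alternative based on split() and join(), which is a swapped-in technique rather than something from the snippet above:

```python
import re

# Hypothetical sample containing a tab, a newline, and plain spaces.
sample = "tabs\tand\nnewlines and spaces"

# replace(" ", "") touches only the space character.
only_spaces_removed = sample.replace(" ", "")

# \s+ matches every run of whitespace, including tabs and newlines.
all_removed_regex = re.sub(r"\s+", "", sample)

# split() with no arguments splits on any whitespace; join glues the pieces back.
all_removed_split = "".join(sample.split())

print(repr(only_spaces_removed))  # 'tabs\tand\nnewlinesandspaces'
print(repr(all_removed_regex))    # 'tabsandnewlinesandspaces'
print(repr(all_removed_split))    # 'tabsandnewlinesandspaces'
```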

Repeat offenders? Use regex compile

When you need to slap the same replacement operation on numerous strings, compile your pattern once using re.compile() and reuse it:

import re

pattern = re.compile(r'\s+')

# Action time
cleaned_1 = pattern.sub('', "One small step for a coder")
cleaned_2 = pattern.sub('', "One giant leap for coderkind")
# 'Onesmallstepforacoder', 'Onegiantleapforcoderkind'
# Neil Armstrong would be proud!
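One way to package this, sketched under the assumption that you want a reusable helper: keep the compiled pattern at module level and wrap it in a small function. The names WHITESPACE and remove_whitespace are hypothetical, not part of any library:

```python
import re

# Compiled once at import time, reused on every call.
WHITESPACE = re.compile(r"\s+")


def remove_whitespace(text: str) -> str:
    """Return text with every run of whitespace removed (hypothetical helper)."""
    return WHITESPACE.sub("", text)


# Made-up inputs for illustration.
phrases = ["One small step", "for\ta coder"]
cleaned = [remove_whitespace(phrase) for phrase in phrases]

print(cleaned)  # ['Onesmallstep', 'foracoder']
```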

Addressing multiline strings

When dealing with multiline strings, splitlines() becomes a real diamond in the rough:

lines = " First line\n2 + 2 = 4. Right?\n".splitlines()
endproduct = [line.strip() for line in lines]
# ['First line', '2 + 2 = 4. Right?']
# Math checks out. Now, beer me.

If you're sentimentally attached to the newline characters, pass True (the keepends argument) to splitlines():

lines_with_eol = " Wait!\n I miss my new lines. ".splitlines(True)
# [' Wait!\n', ' I miss my new lines. ']
# They’re baaaack...
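Often the goal is a cleaned multiline string rather than a list. A minimal sketch, assuming you also want to drop lines that turn out empty after stripping (that filtering step is a choice, not a requirement):

```python
# Hypothetical multiline input with ragged indentation and a blank line.
block = "  First line  \n\tSecond line\n   \nThird line   \n"

# Strip every line individually.
stripped_lines = [line.strip() for line in block.splitlines()]

# Optionally discard lines that are empty once stripped.
non_empty = [line for line in stripped_lines if line]

# Glue the survivors back into one string.
result = "\n".join(non_empty)

print(repr(result))  # 'First line\nSecond line\nThird line'
```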

Battle against file data

Files love to sneak whitespace in line by line, starting with the newline that ends every line you read. Here's how you tackle it:

with open('data.txt', 'r') as file:
    stripped_content = [line.strip() for line in file]
# Hide and seek, whitespace edition
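Here is a self-contained sketch of the same idea. It writes a throwaway temporary file first so it can run anywhere; the file contents are invented. It also contrasts rstrip("\n"), which drops only the line ending, with strip(), which drops all surrounding whitespace:

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on data.txt.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("  alpha  \n\tbeta\n\n  gamma\n")
    path = tmp.name

try:
    with open(path, encoding="utf-8") as file:
        raw_lines = file.readlines()
finally:
    os.remove(path)  # clean up the temporary file

# Keep indentation, drop only the trailing newline.
newline_only = [line.rstrip("\n") for line in raw_lines]

# Drop all surrounding whitespace and skip lines that end up empty.
fully_stripped = [line.strip() for line in raw_lines if line.strip()]

print(newline_only)    # ['  alpha  ', '\tbeta', '', '  gamma']
print(fully_stripped)  # ['alpha', 'beta', 'gamma']
```

Which variant fits depends on whether leading indentation carries meaning in your file.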

Whitespace in data structures

When dealing with lists or dictionaries whose strings carry stray whitespace, apply strip() within list comprehensions or dict comprehensions to cleanse each element. This assumes every element, key, and value is a string:

# Classy lists
clean_list = [elem.strip() for elem in messy_list]

# Dignified dictionaries
clean_dict = {k.strip(): v.strip() for k, v in messy_dict.items()}
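Real-world containers often mix strings with numbers or other types, and calling strip() on an int raises AttributeError. A hedged sketch of one way around that, using a hypothetical helper named strip_if_str and made-up sample data:

```python
def strip_if_str(value):
    """Strip strings; return any other type unchanged (hypothetical helper)."""
    return value.strip() if isinstance(value, str) else value


# Invented sample data mixing strings and numbers.
messy_list = ["  apple ", "\tbanana\n", 42]
messy_dict = {" name ": "  Ada ", "age": 36}

clean_list = [strip_if_str(elem) for elem in messy_list]
clean_dict = {strip_if_str(k): strip_if_str(v) for k, v in messy_dict.items()}

print(clean_list)  # ['apple', 'banana', 42]
print(clean_dict)  # {'name': 'Ada', 'age': 36}
```

Note that stripping keys can make two different keys collide (for example " name" and "name"), in which case the later one wins.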