
How to convert 'binary string' to normal string in Python3?

by Alex Kataev · Feb 16, 2025
TLDR

Decode a binary string by slicing it into 8-bit chunks, parsing each chunk to an integer with int(binary_chunk, 2), and converting each integer to a character with chr(). Finally, collect the characters with ''.join() over a generator expression:

binary_string = '0100100001100101011011000110110001101111'
normal_string = ''.join(chr(int(binary_string[i:i+8], 2)) for i in range(0, len(binary_string), 8))
print(normal_string)  # Output: Hello (#hello, am in binary world!)
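One caveat with the chr() approach: it treats every 8-bit chunk as its own character, which matches ASCII text but not multi-byte UTF-8 characters. A minimal sketch of an alternative is to turn the whole bit string into bytes first and decode afterwards. The helper name bits_to_text is hypothetical, and the sketch assumes the input length is a multiple of 8:

```python
def bits_to_text(bits: str, encoding: str = "utf-8") -> str:
    """Decode a string of 0s and 1s (length a multiple of 8) into text."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    # Read the whole bit string as one big integer, then split it into bytes.
    raw = int(bits, 2).to_bytes(len(bits) // 8, byteorder="big")
    return raw.decode(encoding)


print(bits_to_text("0100100001100101011011000110110001101111"))  # Hello
print(bits_to_text("1100001110101001"))  # é (two bytes, one character)
```

Passing the byte count to to_bytes() keeps leading zero bytes that int() would otherwise drop.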

Encoding and decoding in detail

Turning normal strings to binary

To unravel how binary strings come about, let's see how strings get encoded to binary. Encoding morphs a string into bytes (basically express shipping to Binary-ville 📦):

text = 'Hello'
binary_encoded = text.encode('ascii')  # or 'utf-8'
print(binary_encoded)  # Output: b'Hello' (Whoa! Text just got a prefix haircut!)

Remember, encode() works with a character encoding. If you pass none, Python 3 defaults to 'utf-8'. Common choices include 'ascii' and 'utf-8'; 'ascii' covers only code points 0 to 127, while 'utf-8' can represent every Unicode character.
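Note that encode() returns a bytes object, not a string of 0s and 1s like the one in the TLDR. As a rough sketch of how such a digit string could be produced, each byte can be formatted as eight binary digits. The helper name text_to_bits is made up for illustration:

```python
def text_to_bits(text: str, encoding: str = "utf-8") -> str:
    # format(byte, "08b") renders each byte as exactly eight binary digits,
    # keeping the leading zeros that bin() would drop.
    return "".join(format(byte, "08b") for byte in text.encode(encoding))


print(text_to_bits("Hello"))
# 0100100001100101011011000110110001101111
```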

From binary strings back to normal

Decoding reverts binary strings back into the realm of human-readable text. Do remember to use the same encoding used during the encoding process:

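binary_text = b'Hello'
normal_text = binary_text.decode('utf-8')
print(normal_text)  # Output: Hello (Welcome back to normality!)
# Note: b'01001000...'.decode('utf-8') would only return the same digits as a str.
# For a string of 0s and 1s, use the 8-bit chunk conversion from the TLDR.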
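To see why the encoding has to match, here is a small illustrative sketch (the sample word is an arbitrary choice) that decodes the same bytes with three different codecs:

```python
data = "café".encode("utf-8")      # b'caf\xc3\xa9'

right = data.decode("utf-8")       # café
wrong = data.decode("latin-1")     # cafÃ© (wrong codec, silently garbled)

try:
    data.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    # ASCII has no mapping for bytes above 127, so decoding raises.
    ascii_failed = True

print(right, wrong, ascii_failed)
```

A mismatched codec does not always raise an error; sometimes it quietly produces the wrong text, which is harder to notice.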

Beware the 'b' prefix

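The 'b' prefix is binary's nametag in Python: it marks a bytes literal and shows up whenever a bytes object is printed. It belongs to the displayed representation, not to the data itself. encode() returns bytes (shown with the prefix), and decode() returns a str (shown without it).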
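A quick sketch of what the prefix is and is not, including a common pitfall: calling str() on bytes yields the printed representation, prefix and quotes included, rather than the decoded text.

```python
data = "Hello".encode("utf-8")

print(type(data))   # <class 'bytes'>
print(data)         # b'Hello'
print(len(data))    # 5, the prefix and quotes are not part of the data

# Pitfall: str() on bytes gives the repr, not the decoded text.
as_repr = str(data)             # "b'Hello'"
as_text = data.decode("utf-8")  # 'Hello'
print(as_repr, as_text)
```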

When ASCII isn't enough

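But wait! What about data that arrives as hex or base64 text rather than raw bytes? These are binary-to-text encodings, not character encodings, so decode() on a str won't handle them. Fear not, Python's standard library has a codecs module for those: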

import codecs

# Exotic encoding like 'hex' or 'base64':
binary_string = '68656c6c6f'  # 'hello' in hex
normal_string = codecs.decode(binary_string, 'hex').decode('utf-8')
print(normal_string)  # Output: hello (Hello from the other side...of encoding!)
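For these two formats, the standard library also offers more direct tools than codecs. The following sketch uses bytes.fromhex() and base64.b64decode(); the sample inputs are illustrative:

```python
import base64

hex_text = "68656c6c6f"   # 'hello' as hex digits
b64_text = "aGVsbG8="     # 'hello' as base64

# Both calls return bytes, which still need a character decoding step.
from_hex = bytes.fromhex(hex_text).decode("utf-8")
from_b64 = base64.b64decode(b64_text).decode("utf-8")

print(from_hex, from_b64)  # hello hello
```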

Advanced scenarios and curveballs

Binary padding

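Every character occupies a full 8-bit chunk, so leading zeros are part of the data and must not be stripped. Calling lstrip('0') on '0011000100110010' would leave 14 bits and shift every chunk boundary. If a binary string arrives with its leading zeros missing, left-pad it back to the nearest 8-bit boundary before converting:

binary_string = '11000100110010'  # 14 bits: two leading zeros went missing
padded_length = -(-len(binary_string) // 8) * 8  # Round up to a multiple of 8
binary_string_padded = binary_string.zfill(padded_length)  # '0011000100110010'
# Proceed with the same conversion process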
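One way to account for padding is to restore it rather than strip it, since leading zeros inside a byte carry meaning. The sketch below assumes the missing bits are leading zeros of the first byte, which is an assumption about the input rather than something the bit string can prove. The helper name pad_to_byte_boundary is hypothetical:

```python
def pad_to_byte_boundary(bits: str) -> str:
    """Left-pad a bit string with zeros so its length is a multiple of 8."""
    remainder = len(bits) % 8
    if remainder == 0:
        return bits
    return "0" * (8 - remainder) + bits


trimmed = "11000100110010"   # 14 bits: two leading zeros were lost
padded = pad_to_byte_boundary(trimmed)
text = "".join(chr(int(padded[i:i + 8], 2)) for i in range(0, len(padded), 8))
print(padded, text)  # 0011000100110010 12
```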

Special characters, Emojis, and UTF-8

For strings spiced up with special characters or emojis, 'utf-8' is your encoding go-to, because nobody wants errors with their strings:

special_string = 'Hello, 🌍!'  # Apparently, the world is responding!
binary_encoded = special_string.encode('utf-8')  # Encode to bytes...
normal_string = binary_encoded.decode('utf-8')  # ...and decode back to string. All while string's on vacation. ¯\_(ツ)_/¯
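To make the difference concrete, this sketch shows what happens when the same string is encoded with 'ascii' instead, and what the errors parameter of encode() does about it:

```python
special_string = "Hello, 🌍!"

try:
    special_string.encode("ascii")
    failed_at = None
except UnicodeEncodeError as exc:
    failed_at = exc.start   # index of the first character ASCII cannot encode
print(failed_at)            # 7

# errors="replace" trades the exception for data loss: '?' replaces the emoji.
lossy = special_string.encode("ascii", errors="replace")
utf8_bytes = special_string.encode("utf-8")
print(lossy)                # b'Hello, ?!'
print(len(utf8_bytes))      # 12 (the emoji alone takes 4 bytes)
```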

Error handling: a necessity

Conversion into integers or characters may hit bumps if the binary sequence is not clean: int() raises ValueError on any character other than 0 or 1, and a non-string input raises TypeError. Use try-except blocks to handle these bumps:

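binary_string = '0100100001100101011011000110110001101111'
try:
    # Your conversion code here
    normal_string = ''.join(chr(int(binary_string[i:i+8], 2)) for i in range(0, len(binary_string), 8))
    print(normal_string)
except ValueError:
    # Binary input was polluting the pool: a chunk held something other than 0 or 1
    print("The binary input was not in the correct format.")
except TypeError:
    # Someone threw a non-binary value in there, such as None or an int
    print("A non-binary value was provided.")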
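One gap the exceptions do not cover: a trailing chunk shorter than 8 bits, such as '101', is still valid input for int() and converts silently to the wrong character. A possible sketch that adds a length check and wraps everything in one function follows. The name safe_bits_to_text and the choice to return None on failure are illustrative assumptions, not a fixed convention:

```python
def safe_bits_to_text(bits):
    """Return the decoded text, or None if the input is not valid 8-bit binary."""
    try:
        if len(bits) % 8 != 0:
            raise ValueError("length is not a multiple of 8")
        return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))
    except (ValueError, TypeError) as exc:
        # ValueError: bad digits or bad length; TypeError: input is not a string.
        print(f"Could not convert input: {exc}")
        return None


print(safe_bits_to_text("0100100001101001"))  # Hi
print(safe_bits_to_text("01002000"))          # None (contains a '2')
print(safe_bits_to_text(None))                # None (not a string)
```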