Remove all special characters except space from a string using JavaScript

By Alex Kataev · Jan 5, 2025
TLDR

To strip special characters except spaces, use .replace(/[^a-zA-Z0-9 ]/g, "") in JavaScript. It sanitizes the string by keeping only ASCII letters, digits, and spaces:

let sanitized = "String# with$ special% chars!".replace(/[^a-zA-Z0-9 ]/g, "");
console.log(sanitized); // Outputs: "String with special chars"

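In this pattern, [^...] matches any character that is not listed inside the square brackets, a-zA-Z0-9 lists the ASCII letters and digits, and the literal space after them keeps spaces. The g flag makes the replacement global, so every match in the string is removed rather than just the first one.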
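To make the pattern reusable, it can be wrapped in a small function. This is a minimal sketch; the name stripSpecialChars is hypothetical and not part of any library.

```javascript
// Hypothetical helper wrapping the TLDR pattern.
function stripSpecialChars(input) {
  // Remove every character that is not an ASCII letter, digit, or space.
  return input.replace(/[^a-zA-Z0-9 ]/g, "");
}

console.log(stripSpecialChars("Hello, World! 123")); // "Hello World 123"
console.log(stripSpecialChars("a_b-c d"));           // "abc d"
console.log(stripSpecialChars("line1\nline2"));      // "line1line2"
```

Only the literal space is preserved, so tabs and newlines are removed along with the punctuation, as the last call shows.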

Under the hood of regex patterns

Some variations of the regex pattern can be used based on your specific needs. Let's look at a few different scenarios:

  • Preserve letters and spaces, removing digits: .replace(/[^a-zA-Z ]/g, "")
  • Preserve only letters, removing digits, spaces, and special characters: .replace(/[^a-zA-Z]/g, "")
  • Preserve word characters (letters, digits, underscores) and spaces: .replace(/[^\w ]/g, "")

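Remember, \w in regex signifies word characters (ASCII letters, digits, and the underscore), so the last pattern keeps underscores too. A caret (^) placed first inside the square brackets negates the character class.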
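Running the three variations on one input makes the differences easier to see. The sample string and variable names below are made up for illustration.

```javascript
// Illustrative input containing an underscore, digits, spaces, and punctuation.
const sample = "user_name 42 (admin)!";

// Letters and spaces only: digits and the underscore are removed.
const lettersAndSpaces = sample.replace(/[^a-zA-Z ]/g, "");
console.log(lettersAndSpaces); // "username  admin" (two spaces where "42" was)

// Letters only: spaces are removed as well.
const lettersOnly = sample.replace(/[^a-zA-Z]/g, "");
console.log(lettersOnly); // "usernameadmin"

// Word characters and spaces: the underscore and digits survive.
const wordCharsAndSpaces = sample.replace(/[^\w ]/g, "");
console.log(wordCharsAndSpaces); // "user_name 42 admin"
```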

Regex patterns for specific cases

Non-ASCII characters

When dealing with non-ASCII characters, such as accented letters or letters from other languages, we need a different regex:

let sanitized = "Café numéro 42!".replace(/[^a-zA-Z0-9 \u00C0-\u00FF]/g, "");
console.log(sanitized); // Now serving: "Café numéro 42", special characters not included!

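This example includes a Unicode range (\u00C0-\u00FF) that covers the accented letters in the Latin-1 Supplement block, often used in Western European languages. Note that the range also contains the multiplication sign × (U+00D7) and the division sign ÷ (U+00F7), so those two symbols survive as well.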
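The range \u00C0-\u00FF is mostly accented letters, but it also contains × (U+00D7) and ÷ (U+00F7). If those should be stripped too, one option is to split the range around them. The sketch below assumes that goal; the function names are hypothetical.

```javascript
// Keeps everything in U+00C0..U+00FF, including the × and ÷ signs.
function keepLatin1Range(input) {
  return input.replace(/[^a-zA-Z0-9 \u00C0-\u00FF]/g, "");
}

// Same idea, but the range is split to leave out U+00D7 (×) and U+00F7 (÷).
function keepLatin1Letters(input) {
  return input.replace(/[^a-zA-Z0-9 \u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]/g, "");
}

// "Café × crème ÷ 2!" written with escapes so the accents are precomposed.
const menu = "Caf\u00E9 \u00D7 cr\u00E8me \u00F7 2!";

console.log(keepLatin1Range(menu));   // "Café × crème ÷ 2"
console.log(keepLatin1Letters(menu)); // "Café  crème  2" (double spaces remain)
```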

Libraries for bigger fish

When you need to go beyond the usual catch, i.e., for different spellings and character sets, libraries that handle complex mappings come in handy, such as:

  • he: For all your HTML entity needs.
  • speakingurl: Turning arbitrary text into readable URL slugs.
  • mollusc: Alphabets getting out of control? Not anymore!

Embracing diversity with Unicode

To handle the full range of Unicode characters for international applications, you would need to work with advanced features like String.prototype.normalize and Unicode property escapes in regex:

let sanitized = "O’Malley won 1st 😊"
  .normalize("NFD")
  .replace(/[\u0300-\u036f]/g, "")
  .replace(/[^a-zA-Z0-9 ]/g, "");
console.log(sanitized); // Prints: "OMalley won 1st " (the space before the emoji stays). Emojis have left the chat!

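The above code uses Unicode Normalization Form D (NFD, canonical decomposition) to split precomposed characters such as é into a base letter plus a combining mark. The first replace strips the combining marks (U+0300 to U+036F), and the second removes everything that is not an ASCII letter, digit, or space.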
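Unicode property escapes take a different route: instead of reducing text to ASCII, they keep letters and digits from any script. The sketch below assumes an engine with ES2018 support for \p{...} and the u flag; the function name is hypothetical.

```javascript
// Hypothetical helper: keep Unicode letters, Unicode numbers, and spaces.
function keepLettersDigitsSpaces(input) {
  return input
    // Compose "e" + combining accent into a single letter first,
    // otherwise the combining mark (category M) would be stripped.
    .normalize("NFC")
    // \p{L} = any letter, \p{N} = any number; the u flag is required.
    .replace(/[^\p{L}\p{N} ]/gu, "");
}

console.log(keepLettersDigitsSpaces("Caf\u00E9 num\u00E9ro 42!")); // "Café numéro 42"
console.log(keepLettersDigitsSpaces("Привет, мир!"));              // "Привет мир"
console.log(keepLettersDigitsSpaces("Cafe\u0301 😊"));             // "Café " (trailing space)
```

Unlike the NFD approach, accents are preserved here rather than stripped, so pick whichever matches the output you need.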

Exploring more with regex

Special character party

Sometimes, you might want to host a small party for your favourite special characters, along with spaces:

  • Spaces and periods: .replace(/[^a-zA-Z0-9 .]/g, "")
  • Spaces and hyphens: .replace(/[^a-zA-Z0-9 -]/g, "")
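When inviting a hyphen, its position matters: placed last (or first) in the class it is a literal hyphen, while between two characters it defines a range. Escaping it as \- avoids the ambiguity. The sample text below is made up for illustration.

```javascript
const text = "e-mail: first.last@example.com (v2.0)";

// Hyphen placed last in the class, so it is treated literally.
const keepHyphens = text.replace(/[^a-zA-Z0-9 -]/g, "");
console.log(keepHyphens); // "e-mail firstlastexamplecom v20"

// Periods are literal inside a character class; no escaping needed.
const keepPeriods = text.replace(/[^a-zA-Z0-9 .]/g, "");
console.log(keepPeriods); // "email first.lastexample.com v2.0"

// Both at once, with the hyphen escaped so its position no longer matters.
const keepBoth = text.replace(/[^a-zA-Z0-9 .\-]/g, "");
console.log(keepBoth); // "e-mail first.lastexample.com v2.0"
```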

Performance hustle

Regex can sometimes hog the CPU, especially when the string is very long. In such cases, consider breaking up the string, using string methods like indexOf or charAt, or preprocessing the string to avoid regex altogether.
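A regex-free version can be sketched with a plain loop over character codes. Whether it is actually faster depends on the engine and the input, so treat this as something to benchmark rather than a guaranteed win; the function name is hypothetical.

```javascript
// Hypothetical helper: same result as /[^a-zA-Z0-9 ]/g, without a regex.
function stripWithoutRegex(input) {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const code = input.charCodeAt(i);
    const isDigit = code >= 48 && code <= 57;  // "0".."9"
    const isUpper = code >= 65 && code <= 90;  // "A".."Z"
    const isLower = code >= 97 && code <= 122; // "a".."z"
    const isSpace = code === 32;               // " "
    if (isDigit || isUpper || isLower || isSpace) {
      out += input[i];
    }
  }
  return out;
}

console.log(stripWithoutRegex("Hello, World! 123")); // "Hello World 123"
```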

Non-regex alternatives

Or step out of the replace() ring and go to the functional land. Note that this version still uses a small regex to test each character:

let result = "String# with$ special% chars!"
  .split('')
  .filter(c => /[a-zA-Z0-9 ]/.test(c))
  .join('');
console.log(result); // "String with special chars" - Clean as a whistle!