Remove all special characters except space from a string using JavaScript
To strip special characters except spaces, use .replace(/[^a-zA-Z0-9 ]/g, "")
in JavaScript. It sanitizes the string by keeping only letters, digits, and spaces.
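In this regex pattern, [^ ] matches any single character that is not listed inside the square brackets. Inside the brackets, a-zA-Z0-9 lists the letters and digits, and the space that follows them lists the space character, so all of these are kept. The g flag makes the replacement global, so every match in the string is removed rather than only the first one.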
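As a minimal sketch of the pattern described above, the call can be wrapped in a small function. The name stripSpecialChars is a hypothetical helper introduced here for illustration only; the regex is the one from the text.

```javascript
// Hypothetical helper name; the pattern is the one discussed in the text.
function stripSpecialChars(input) {
  // [^a-zA-Z0-9 ] matches every character that is NOT a letter, digit, or space.
  // The g flag replaces all matches instead of only the first one.
  return input.replace(/[^a-zA-Z0-9 ]/g, "");
}

console.log(stripSpecialChars("Hello, World! 123 #JS")); // "Hello World 123 JS"
```

Note that replace returns a new string; the original string is left unchanged.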
Under the hood of regex patterns
Some variations of the regex pattern can be used based on your specific needs. Let's look at a few different scenarios:
- Preserve letters and spaces, removing digits:
.replace(/[^a-zA-Z ]/g, "")
- Preserve only letters, removing digits, spaces, and special characters:
.replace(/[^a-zA-Z]/g, "")
- Preserve word characters (letters, digits, underscores) and spaces:
.replace(/[^\w ]/g, "")
Remember, \w in regex signifies word characters, which in JavaScript means ASCII letters, digits, and the underscore. The caret (^) at the start of a character class negates the class.
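To compare the three variations listed above, the following sketch applies each one to the same input. The sample string is made up for illustration.

```javascript
// Made-up sample input containing an underscore, digits, spaces, and punctuation.
const sample = "user_42 logged in!";

// Letters and spaces only: the underscore, digits, and "!" are removed.
console.log(sample.replace(/[^a-zA-Z ]/g, "")); // "user logged in"

// Letters only: spaces are removed as well.
console.log(sample.replace(/[^a-zA-Z]/g, "")); // "userloggedin"

// Word characters and spaces: the underscore and digits survive.
console.log(sample.replace(/[^\w ]/g, "")); // "user_42 logged in"
```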
Regex patterns for specific cases
Non-ASCII characters
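When dealing with non-ASCII characters such as accented letters or characters from other languages, the character class has to be extended.
Adding the Unicode range \u00C0-\u00FF to the class covers the accented letters of the Latin-1 Supplement block, often used in Western European languages. Note that this range also contains the multiplication sign (×, U+00D7) and the division sign (÷, U+00F7).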
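One way this could look, assuming the range is simply appended to the basic character class, is sketched below. The function name keepLatin1 is hypothetical, and the NFC normalization step is an added assumption so that accented letters arrive as single precomposed characters.

```javascript
// Hypothetical helper: keeps ASCII letters, digits, spaces, and U+00C0-U+00FF.
function keepLatin1(input) {
  return input
    // Assumption: normalize to NFC so "è" is one code point inside the range.
    .normalize("NFC")
    .replace(/[^a-zA-Z0-9 \u00C0-\u00FF]/g, "");
}

console.log(keepLatin1("Crème brûlée: 100%!")); // "Crème brûlée 100"
console.log(keepLatin1("3 × 4"));               // "3 × 4" (× is U+00D7, inside the range)
```

Letters outside this block, for example č or ł, would still be removed by this pattern.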
Libraries for bigger fish
When you need to go beyond the usual catch, i.e., for different spellings and character sets, libraries that handle complex mappings come in handy, such as:
- he: For all your HTML entity needs.
- speakingurl: Making URLs readable since.. well, insert date here.
- mollusc: Alphabets getting out of control? Not anymore!
Embracing diversity with Unicode
To handle the full range of Unicode characters for international applications, you need more advanced features such as String.prototype.normalize and Unicode property escapes in regex (for example \p{L} for letters and \p{N} for numbers, which require the u flag).
Normalizing to Unicode's Normalization Form Canonical Decomposition (NFD) splits each precomposed character into a base character followed by its combining marks before the regex is applied.
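The two features can be combined in more than one way. The sketch below shows two possibilities under stated assumptions: keepUnicodeLetters keeps letters and numbers from any script, while stripAccents uses NFD to drop accents and keep the base letters. Both function names are hypothetical and introduced here for illustration.

```javascript
// Hypothetical helper: keep letters (\p{L}), numbers (\p{N}), and spaces
// from any script. NFC is applied first so accented letters stay whole.
function keepUnicodeLetters(input) {
  return input.normalize("NFC").replace(/[^\p{L}\p{N} ]/gu, "");
}

// Hypothetical helper: decompose with NFD, drop combining marks (\p{M}),
// then remove whatever is still not a letter, number, or space.
function stripAccents(input) {
  return input
    .normalize("NFD")
    .replace(/\p{M}/gu, "")
    .replace(/[^\p{L}\p{N} ]/gu, "");
}

console.log(keepUnicodeLetters("Crème brûlée, 2 für 1!")); // "Crème brûlée 2 für 1"
console.log(stripAccents("Crème brûlée, 2 für 1!"));       // "Creme brulee 2 fur 1"
```

Property escapes only work with the u (or v) flag; without it, \p is treated as a plain escaped p.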
Exploring more with regex
Special character party
Sometimes, you might want to host a small party for your favourite special characters, along with spaces:
- Spaces and periods:
.replace(/[^a-zA-Z0-9 .]/g, "")
- Spaces and hyphens:
.replace(/[^a-zA-Z0-9 -]/g, "")
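In the hyphen pattern above, the hyphen sits at the end of the character class so it is read literally and not as a range. If the guest list changes often, one option is to build the class from a string of extra characters. The helper below, keepAlnumAnd, is a hypothetical sketch and not part of any library; it escapes the few characters that have a special meaning inside a character class.

```javascript
// Hypothetical helper: keep ASCII letters and digits plus any extra characters.
function keepAlnumAnd(input, extras = " ") {
  // Escape characters that are special inside [...]: backslash, ], ^ and -.
  const escaped = extras.replace(/[\\\]^-]/g, "\\$&");
  const pattern = new RegExp(`[^a-zA-Z0-9${escaped}]`, "g");
  return input.replace(pattern, "");
}

console.log(keepAlnumAnd("v1.2-beta (final)!", " .-")); // "v1.2-beta final"
console.log(keepAlnumAnd("a-b.c d"));                   // "abc d"
```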
Performance hustle
Regex can sometimes hog the CPU, especially when the string is long. In such cases, consider breaking the string into chunks, using string methods like indexOf or charAt, or preprocessing the string to avoid regex altogether.
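As a rough sketch of the indexOf and charAt approach mentioned above, the loop below checks each character against a whitelist string. The names ALLOWED and stripWithLoop are hypothetical, and whether this beats the regex depends on the engine and the input, so it is worth measuring before switching.

```javascript
// Hypothetical whitelist: every character that should survive.
const ALLOWED =
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ";

function stripWithLoop(input) {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    // indexOf returns -1 when the character is not in the whitelist.
    if (ALLOWED.indexOf(ch) !== -1) {
      out += ch;
    }
  }
  return out;
}

console.log(stripWithLoop("Hello, World! 123")); // "Hello World 123"
```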
Non-regex alternatives
Or better yet, ditch the regex ring and head to functional land in such cases, for example by splitting the string into characters, filtering them, and joining the result.
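A functional version might look like the sketch below, which assumes the same rule as the basic regex (ASCII letters, digits, and spaces). The names isAllowed and stripFunctional are hypothetical.

```javascript
// Hypothetical predicate: true for ASCII digits, letters, and the space.
function isAllowed(ch) {
  const code = ch.charCodeAt(0);
  return (
    (code >= 48 && code <= 57) ||  // 0-9
    (code >= 65 && code <= 90) ||  // A-Z
    (code >= 97 && code <= 122) || // a-z
    code === 32                    // space
  );
}

// Split into characters, keep the allowed ones, and join them back together.
const stripFunctional = (input) =>
  Array.from(input).filter(isAllowed).join("");

console.log(stripFunctional("Hello, World! 123 #JS")); // "Hello World 123 JS"
```

Array.from iterates by code point, so characters outside the Basic Multilingual Plane are handled as one unit and dropped by the predicate.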