Capitalize words in string
To capitalize each word in a string, use .replace() with a regular expression and a callback.
The pattern \b\w targets the first character of each word, and the callback c.toUpperCase() swaps it to uppercase, so every word in the result starts with an uppercase letter.
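As a minimal sketch of the approach described above (the function name capitalizeWords is just an illustrative choice, not something the article prescribes):

```javascript
// Capitalize the first character of every word.
// \b matches a word boundary and \w the first word character after it;
// the g flag applies the replacement to every match, not just the first.
function capitalizeWords(str) {
  return str.replace(/\b\w/g, (c) => c.toUpperCase());
}

console.log(capitalizeWords('hello wide world')); // "Hello Wide World"
```

Only the matched character is changed, so the rest of each word is left exactly as it was.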
Capitalization with edge cases
The fast answer provides a quick solution for the common cases, but let's toughen it up to handle some edge cases that include punctuation, special characters, and international symbols.
Sailing through punctuation and special characters
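When tussling with punctuation or brackets, \b\w might lose its balance: an apostrophe counts as a word boundary, so "don't" turns into "Don'T". To circumvent this, refine the regex.
A non-capturing group (?:...) can list the contexts in which a character should be capitalized, and because only that character is changed, the rest of each word keeps its original capitalization.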
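One possible refinement, sketched here under the assumption that a word starts at the beginning of the string or after whitespace, optionally behind an opening quote or bracket. The exact pattern is an illustration, not necessarily the one the article originally showed:

```javascript
// Capitalize the first non-space character of each word.
// (?:^|\s)     - non-capturing group: start of string or a whitespace character
// ["'(\[{]*    - optional opening quotes or brackets before the word
// \S           - the first visible character of the word itself
// Uppercasing the whole match is safe because whitespace and punctuation
// are unaffected by toUpperCase().
function capitalizeWords(str) {
  return str.replace(/(?:^|\s)["'(\[{]*\S/g, (match) => match.toUpperCase());
}

console.log(capitalizeWords("don't stop (believing)")); // "Don't Stop (Believing)"
console.log(capitalizeWords('the iPhone and NASA'));    // "The IPhone And NASA"
```

Because an apostrophe in the middle of a word is not preceded by whitespace, the letter after it is left alone.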
Brushing up on international symbols
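Typical patterns such as \w only cover ASCII letters, digits, and the underscore, so characters outside the basic ASCII set get left in the dust. Let's offer a ride to the non-ASCII characters.
Adding \p{L} to our regex (together with the u flag, which it requires) matches any kind of letter from any language. normalize() helps keep accented characters in one consistent form, so a letter and its accent are handled as a single character.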
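A hedged sketch of how \p{L} and normalize() might be combined. The function name and the choice of the NFC normalization form are assumptions made for illustration:

```javascript
// Capitalize the first letter of each word for any script.
// normalize('NFC') composes a base letter plus combining accent into a
// single code point where one exists, so toUpperCase() sees one character.
// (^|[^\p{L}\p{M}]) - start of string, or a character that is neither a
//                     letter nor a combining mark
// (\p{L})           - the first letter of the word
function capitalizeUnicode(str) {
  return str
    .normalize('NFC')
    .replace(/(^|[^\p{L}\p{M}])(\p{L})/gu, (match, before, letter) => before + letter.toUpperCase());
}

console.log(capitalizeUnicode('élan vital')); // "Élan Vital"
console.log(capitalizeUnicode('привет мир')); // "Привет Мир"
```

The u flag is what enables \p{...} property escapes; without it the pattern does not match letters by Unicode property.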
Improving performance and versatility
We aim for our solution to be capable of juggling strings of diverse lengths and robust enough to catch all types of edge cases.
Standardizing to lowercase
Before capitalizing, it's often good practice to convert the whole string to lowercase.
By calling toLowerCase() first, we establish a uniform base on which to apply toUpperCase() to the first letter of each word. Keep in mind that this discards any existing capitalization, such as acronyms.
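A small sketch of the lowercase-first idea, assuming the simple \b\w pattern from earlier (the name toTitleCase is illustrative):

```javascript
// Lowercase everything first, then uppercase the first character of each word.
function toTitleCase(str) {
  return str.toLowerCase().replace(/\b\w/g, (c) => c.toUpperCase());
}

console.log(toTitleCase('hELLO wORLD')); // "Hello World"
console.log(toTitleCase('NASA launch')); // "Nasa Launch" (the acronym is lost)
```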
A slice of Map and Join
If you're more into map() and join() (after splitting the string into words) and would rather give regex a hard pass, this approach is for you.
It may be a bit slower on long strings, but it scores high on code clarity.
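A regex-free version could look roughly like this. Splitting on a single space character is an assumption of this sketch; tabs or newlines would need extra handling:

```javascript
// Split into words, capitalize each word's first character, and glue the
// words back together. Empty entries produced by consecutive spaces pass
// through unchanged, so the original spacing is preserved.
function capitalizeWithMap(str) {
  return str
    .split(' ')
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(' ');
}

console.log(capitalizeWithMap('hello wide world')); // "Hello Wide World"
```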
Building a robust capitalization function
Rather than a one-trick pony, we want a capitalization function that is versatile and a team player in an existing codebase.
Such a function should offer the flexibility to conserve existing capitalization when required.
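One way such a function might be shaped is sketched below. The option name preserveCase, its default, and the decision to throw on non-string input are all hypothetical design choices, not part of the original article:

```javascript
// Capitalize each whitespace-separated word.
// options.preserveCase (hypothetical option):
//   true  - keep the existing capitalization of the rest of each word
//   false - lowercase the whole string before capitalizing
function capitalizeWords(input, { preserveCase = true } = {}) {
  if (typeof input !== 'string') {
    throw new TypeError('capitalizeWords expects a string');
  }
  const base = preserveCase ? input : input.toLowerCase();
  return base.replace(/(^|\s)(\S)/g, (match, space, first) => space + first.toUpperCase());
}

console.log(capitalizeWords('the NASA launch'));                          // "The NASA Launch"
console.log(capitalizeWords('the NASA launch', { preserveCase: false })); // "The Nasa Launch"
```

Validating the input up front makes misuse visible early, which tends to matter more once the helper is shared across a codebase.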
Going the extra mile
There's no such thing as unnecessary information when we are crafting our code implementations. Let's tackle a few more issues that might spring up when you least expect them.
Spaces, the final frontier
Dealing with a pesky non-breaking space?
By adding the non-breaking space character (\u00A0) to our regex, we ensure no space is left behind!
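A sketch of a pattern that lists the non-breaking space explicitly next to the regular space (the function name is illustrative). Note that in JavaScript the \s class already includes \u00A0, so spelling it out matters mainly for patterns or split() calls that use a literal space:

```javascript
// Treat both a regular space and a non-breaking space (\u00A0) as word
// separators, and capitalize the first visible character after either one.
function capitalizeAfterSpaces(str) {
  return str.replace(/(^|[ \u00A0])(\S)/g, (match, sep, first) => sep + first.toUpperCase());
}

const input = 'hello\u00A0world again';
console.log(capitalizeAfterSpaces(input) === 'Hello\u00A0World Again'); // true
```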
Performance on steroids
For folks who geek out over performance, it's important to remember that the replace() method with regex might consume precious milliseconds on lengthy strings. Pre-processing with toLowerCase() and then applying the pattern might add unnecessary bulk to your runtime.
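Rather than guessing, the cost can be measured. The following is a rough timing sketch; the helper timeIt, the input size, and the run count are all arbitrary assumptions, and results will vary by engine and machine:

```javascript
// Two variants: with and without the toLowerCase() pre-processing step.
const withLowercase = (s) => s.toLowerCase().replace(/\b\w/g, (c) => c.toUpperCase());
const withoutLowercase = (s) => s.replace(/\b\w/g, (c) => c.toUpperCase());

// Hypothetical helper: average wall-clock milliseconds per call.
function timeIt(fn, input, runs = 10) {
  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    fn(input);
  }
  return (Date.now() - start) / runs;
}

const longInput = 'lorem ipsum dolor sit amet '.repeat(5000);
console.log('with toLowerCase():   ', timeIt(withLowercase, longInput), 'ms per run');
console.log('without toLowerCase():', timeIt(withoutLowercase, longInput), 'ms per run');
```

If the difference turns out to be negligible for realistic input sizes, the clearer version is usually the better choice.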
Befriending language quirks
Don't let language-specific characters like the German ß get lost in translation.
A regex pattern can list such characters explicitly, for instance the German umlauts ä, ö, ü and the eszett ß, as an example of how to adjust for language-specific characters.
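A sketch of such a pattern for German text. The character lists and the function name are illustrative assumptions; ß is treated as a letter inside words but is not itself capitalized, since 'ß'.toUpperCase() yields "SS" in JavaScript:

```javascript
// Capitalize the first letter of each word, treating ä, ö, ü, and ß as
// ordinary letters.
// (^|[^a-zA-ZäöüÄÖÜß]) - start of string or any non-letter character
// ([a-zäöü])           - a lowercase letter that starts a word
function capitalizeGerman(str) {
  return str.replace(/(^|[^a-zA-ZäöüÄÖÜß])([a-zäöü])/g, (match, before, letter) => before + letter.toUpperCase());
}

console.log(capitalizeGerman('über größe')); // "Über Größe"
```

For comparison, plain \b\w treats ü, ö, and ß as non-word characters, so it would turn the same input into "üBer GrößE".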
Additional tricks and thorny areas
Stone-cold strings
Don't forget that JavaScript strings are immutable. Any function claiming to modify your string is only handing you a shiny new string!
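A quick illustration of what immutability means in practice (variable names are arbitrary):

```javascript
// String methods return a new string; the original is never changed.
const original = 'hello world';
const shouted = original.toUpperCase();

console.log(original); // "hello world"
console.log(shouted);  // "HELLO WORLD"

// To "update" a variable, assign the returned string back to it.
let title = 'hello world';
title = title.replace(/\b\w/g, (c) => c.toUpperCase());
console.log(title); // "Hello World"
```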
Regex and the cookie monster
When leveraging modern regex features such as the u flag and \p{L} property escapes, make sure your runtime environment is not stuck in the past: engines that don't recognize them throw a SyntaxError on such patterns.
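One way to guard against older runtimes is a small feature check with an ASCII-only fallback. This is a sketch; the function names and the fallback pattern are assumptions:

```javascript
// Detect support for Unicode property escapes. The RegExp constructor is
// used instead of a regex literal so that an unsupported pattern fails at
// runtime inside the try block rather than when the file is parsed.
function supportsUnicodePropertyEscapes() {
  try {
    new RegExp('\\p{L}', 'u');
    return true;
  } catch (err) {
    return false;
  }
}

// Use the Unicode-aware pattern when available, otherwise fall back to ASCII.
function capitalizeWords(str) {
  const pattern = supportsUnicodePropertyEscapes()
    ? new RegExp('(^|[^\\p{L}])(\\p{L})', 'gu')
    : /(^|[^A-Za-z])([A-Za-z])/g;
  return str.replace(pattern, (match, before, letter) => before + letter.toUpperCase());
}

console.log(capitalizeWords('hello world')); // "Hello World"
```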
In language, we trust
Working on an application that caters to a specific language or region? Consider using locale-specific methods like toLocaleUpperCase(). These methods apply the case-mapping rules of a given locale, such as the Turkish distinction between dotted and dotless i.
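A sketch of a locale-aware variant (the function name is illustrative, and the Turkish result assumes the runtime ships the relevant locale rules):

```javascript
// Capitalize the first character of each word using the casing rules of
// the given locale (a BCP 47 language tag such as 'en-US' or 'tr-TR').
function capitalizeForLocale(str, locale) {
  return str.replace(/(^|\s)(\S)/g, (match, space, first) => space + first.toLocaleUpperCase(locale));
}

console.log(capitalizeForLocale('istanbul', 'en-US')); // "Istanbul"
// With Turkish casing rules, a lowercase i is expected to map to the
// dotted capital İ (U+0130) instead of I.
console.log(capitalizeForLocale('istanbul', 'tr-TR'));
```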