
How to escape text for regular expression in Java?

By Nikita Barsukov · Jan 21, 2025
TLDR

To escape text for regex in Java, use Pattern.quote(). This ensures all metacharacters are interpreted as literals.

Example:

String escapedText = Pattern.quote("1+1=2?"); // Escapes the '+' and '?' to avoid existential crisis

Wrapping an input string in Pattern.quote() makes the regex engine treat every special symbol in it literally, so the pattern matches the user's input exactly, even when that input contains metacharacters.
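To make the difference concrete, here is a minimal sketch that searches a text for user input with and without quoting. The class and method names (LiteralSearch, containsLiteral, containsAsRegex) are made up for this illustration.

```java
import java.util.regex.Pattern;

class LiteralSearch {
    // Quoted: the input is matched character for character.
    static boolean containsLiteral(String text, String userInput) {
        return Pattern.compile(Pattern.quote(userInput)).matcher(text).find();
    }

    // Unquoted: the input is parsed as a regex, so '+' and '?' act as quantifiers.
    static boolean containsAsRegex(String text, String userInput) {
        return Pattern.compile(userInput).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "Is it true that 1+1=2? Yes.";
        System.out.println(containsLiteral(text, "1+1=2?")); // true
        System.out.println(containsAsRegex(text, "1+1=2?")); // false
    }
}
```

The unquoted version fails here because "1+1=2?" as a regex means "one or more 1s, then 1, then =, then an optional 2", which never occurs in the text.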

Deep-dive into regex escaping

The essence of Pattern.quote()

Pattern.quote() functions like an "invisibility cloak" for special characters in a string when used with Java's regex engine. It wraps the string between \Q and \E, so the engine reads everything in between as ordinary literal characters. If the input itself contains \E, Pattern.quote() splits the quoted region around it, so the result is still safe.

String userInput = "$5";
String escapedText = Pattern.quote(userInput); // Becomes \Q$5\E. Dollar is just a dollar now!
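A small sketch for inspecting what Pattern.quote() produces and checking that the quoted form matches the original text, including an input that already contains \E. QuoteInspection and matchesItself are illustrative names, not part of any library.

```java
import java.util.regex.Pattern;

class QuoteInspection {
    // True if the quoted form of 's', used as a whole-input regex, matches 's'.
    static boolean matchesItself(String s) {
        return Pattern.matches(Pattern.quote(s), s);
    }

    public static void main(String[] args) {
        System.out.println(Pattern.quote("$5"));    // prints \Q$5\E
        System.out.println(matchesItself("$5"));    // true
        System.out.println(matchesItself("a\\Eb")); // true, even though the input contains \E
    }
}
```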

Other options for regex safety

As well as Pattern.quote(), you can give the Pattern.LITERAL flag a whirl: pass it to Pattern.compile() and the entire pattern is interpreted as a string of literal characters, with metacharacters and escape sequences losing their special meaning.
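As a rough sketch, the flag can be used on its own or combined with Pattern.CASE_INSENSITIVE, which still applies in literal mode. The helper names below are invented for the example.

```java
import java.util.regex.Pattern;

class LiteralFlagDemo {
    // Compiles 'text' so that every character is matched literally.
    static Pattern literal(String text) {
        return Pattern.compile(text, Pattern.LITERAL);
    }

    // Literal matching that also ignores case.
    static Pattern literalIgnoreCase(String text) {
        return Pattern.compile(text, Pattern.LITERAL | Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b").matcher("a.b").matches());           // true
        System.out.println(literal("a.b").matcher("axb").matches());           // false: '.' is not a wildcard
        System.out.println(literalIgnoreCase("A.B").matcher("a.b").matches()); // true
    }
}
```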

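When only part of a pattern should be literal, you can write \Q...\E directly inside your regex string and keep normal regex syntax around it. This suits fixed text you control; for arbitrary input that might itself contain \E, stick with Pattern.quote().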
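One way this could look, assuming a made-up log format where lines start with the literal prefix [INFO]. InlineQuoteDemo and messageOf are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InlineQuoteDemo {
    // \Q[INFO]\E is literal text; without it, [INFO] would be a character class.
    static final Pattern INFO_LINE = Pattern.compile("^\\Q[INFO]\\E\\s+(.*)$");

    // Returns the message after the prefix, or null if the line does not match.
    static String messageOf(String line) {
        Matcher m = INFO_LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(messageOf("[INFO] server started")); // server started
        System.out.println(messageOf("I server started"));      // null
    }
}
```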

Taking care of replacement strings

When dealing with replacement strings in methods like String.replaceAll(), seek solace in Matcher.quoteReplacement(). This method ensures the $ and \ characters - which are special in replacement strings - are safely escaped.
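A minimal sketch of both kinds of escaping working together: Pattern.quote() protects the search side and Matcher.quoteReplacement() protects the replacement side. The {price} placeholder and the fillPrice helper are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacementDemo {
    // Replaces the literal placeholder {price} with 'price', taken verbatim.
    static String fillPrice(String template, String price) {
        return template.replaceAll(
                Pattern.quote("{price}"),        // search text is literal
                Matcher.quoteReplacement(price)  // '$' and '\' are literal too
        );
    }

    public static void main(String[] args) {
        System.out.println(fillPrice("Total: {price}", "$5")); // Total: $5
    }
}
```

Without quoteReplacement(), the replacement "$5" would be read as a reference to capture group 5, which does not exist here. When no regex features are needed at all, String.replace(CharSequence, CharSequence) performs a purely literal replacement and avoids both escaping steps.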

Mastering your regex-escaping skills

Escaping and internationalization

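In the realm of internationalization (i18n), for example when translated messages in a Spring application are filled in through regex-based replacement, run the substituted text through Matcher.quoteReplacement(). Translations and user-supplied values often contain $ or \, and without escaping these would be read as group references or escape sequences rather than as text.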
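The sketch below shows one possible way to fill named placeholders in a translated message. It does not reflect how Spring or any other framework does this internally; the {name} placeholder syntax, the class MessageFormatterSketch and the method render are all assumptions for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MessageFormatterSketch {
    // Hypothetical placeholder syntax: {word}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces each {key} with its value; unknown keys are left untouched.
    static String render(String message, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(message);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps '$' and '\' in the value literal
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = Map.of("price", "5 $", "path", "C:\\temp");
        System.out.println(render("Prix : {price} pour {path}", values));
        // Prix : 5 $ pour C:\temp
    }
}
```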

Escaping and XML/HTML

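When XML or HTML tags play the role of placeholders in translatable text, quote them with Pattern.quote() before matching so that any metacharacters inside a tag are not read as regex syntax. Keep in mind that regex escaping only affects matching: it does not make output safe for a browser, so values inserted into markup still need HTML escaping to stave off potential issues like cross-site scripting (XSS) attacks.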
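As a rough illustration of keeping the two concerns separate, the sketch below quotes a tag-style placeholder for matching and HTML-escapes the inserted value. The escapeHtml helper is a deliberately minimal, hypothetical stand-in; a real application would more likely rely on a vetted library for this step.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlPlaceholderSketch {
    // Hypothetical minimal HTML escaper, for illustration only.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")   // must run first
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // Replaces a literal placeholder tag with an HTML-escaped value.
    static String fill(String template, String placeholderTag, String value) {
        return template.replaceAll(
                Pattern.quote(placeholderTag),
                Matcher.quoteReplacement(escapeHtml(value)));
    }

    public static void main(String[] args) {
        System.out.println(fill("<p>Hello, <name/>!</p>", "<name/>", "<b>$1</b>"));
        // <p>Hello, &lt;b&gt;$1&lt;/b&gt;!</p>
    }
}
```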

Real-world uses of regex escaping

In daily programming tasks, such as parsing a CSV file with a regex special character as a delimiter, or sanitizing user inputs in a search function, escaping becomes crucial. Fail to do so, and you risk going down a rabbit hole of erratic patterns and unintentional vulnerabilities.
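For the delimiter case, a minimal sketch: String.split() takes a regex, so a delimiter such as | or . has to be quoted first. DelimiterSplit and splitOn are illustrative names, and this is not a full CSV parser (it ignores quoted fields).

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class DelimiterSplit {
    // Splits on the delimiter taken literally; limit -1 keeps trailing empty fields.
    static String[] splitOn(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOn("a|b||c", "|"))); // [a, b, , c]
        System.out.println(Arrays.toString(splitOn("1.5.2", ".")));  // [1, 5, 2]
    }
}
```

Passing "|" unquoted would be treated as an alternation of two empty patterns, and "." as "any character", so neither would split on the intended delimiter.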