How to use UTF-8 in resource properties with ResourceBundle

by Nikita Barsukov · Jan 1, 2025
TLDR

To deal with UTF-8 in properties files, use a custom ResourceBundle.Control implementation. This custom class overrides the newBundle method, creating a PropertyResourceBundle through an InputStreamReader with UTF-8 charset.

Example:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.nio.charset.StandardCharsets;
    import java.util.Locale;
    import java.util.PropertyResourceBundle;
    import java.util.ResourceBundle;

    public class UTF8Control extends ResourceBundle.Control {
        @Override
        public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                        ClassLoader loader, boolean reload) throws IOException {
            String bundleName = toBundleName(baseName, locale);
            String resourceName = toResourceName(bundleName, "properties");
            // Open the resource as a stream to avoid "loading it into a truck"
            try (InputStream is = loader.getResourceAsStream(resourceName)) {
                if (is == null) {
                    // No file for this locale: returning null lets getBundle try the next candidate
                    return null;
                }
                // UTF-8 encoding is like your favorite pair of jeans: always fits right
                try (Reader reader = new InputStreamReader(is, StandardCharsets.UTF_8)) {
                    return new PropertyResourceBundle(reader);
                }
            }
        }
    }

    // One small step for a developer, one giant leap for internationalization.
    ResourceBundle bundle = ResourceBundle.getBundle("BundleName", new UTF8Control());

The UTF8Control class makes ResourceBundle decode properties files as UTF-8, so non-ASCII characters load correctly without \uXXXX escapes.

IDE-specific strategies for handling properties

IntelliJ IDEA encoding

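In IntelliJ IDEA, set UTF-8 encoding for properties files. Go to File > Settings > Editor > File Encodings and set Global Encoding, Project Encoding, and Default encoding for properties files to UTF-8. If you need ISO-8859-1 files instead, enable "Transparent native-to-ascii conversion" on the same page: IntelliJ then shows readable characters in the editor while saving them in \uXXXX format.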

Workaround for Eclipse

Eclipse defaults to ISO-8859-1 encoding for properties files. Tackle this either by changing the encoding setting to UTF-8, or by keeping ISO-8859-1 and converting non-ASCII characters to \uXXXX escapes with native2ascii.

Universal solution: Text editors

When IDEs seem to cause more problems than they solve, resort to a generic text editor like Sublime Text or Visual Studio Code. They offer easy ways to view a file's encoding and to reopen or save it in another one.

Achieving compatibility with older Java versions

The good news with Java 9 and newer

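Since Java 9, ResourceBundle reads properties files as UTF-8 by default and falls back to ISO-8859-1 if the file contains byte sequences that are not valid UTF-8. The java.util.PropertyResourceBundle.encoding system property can pin either encoding. No more "lost in translation" issues with your property files!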
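To see the Java 9+ default in isolation, here is a minimal sketch that hands raw UTF-8 bytes to the stream-based PropertyResourceBundle constructor without naming a charset. The class name DefaultEncodingDemo, the key "city", and the sample value are made up for illustration, and the result assumes a Java 9 or newer runtime with the encoding system property left unset.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;

final class DefaultEncodingDemo {

    // Parses raw file bytes with the stream-based constructor; no charset is passed.
    // On Java 9+ the bytes are decoded as UTF-8 by default.
    static String read(byte[] fileBytes, String key) {
        try {
            return new PropertyResourceBundle(new ByteArrayInputStream(fileBytes)).getString(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulates a properties file saved as UTF-8 (the value contains a u-umlaut).
        byte[] utf8File = "city=M\u00FCnchen\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(read(utf8File, "city"));
    }
}
```

On Java 8 the same call would decode the bytes as ISO-8859-1 and return a garbled value, which is why the workarounds in the next section exist.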

The workaround for older Java versions

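For Java 8 and earlier, the default loader reads properties files as ISO-8859-1. Either save files in that encoding and write every character outside ISO-8859-1 as a \uXXXX escape sequence, or keep the files in UTF-8 and load them with the UTF8Control class above.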
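As a small illustration of the escape-sequence route, the sketch below loads ASCII-only file content whose value is written entirely as Unicode escapes. The class name EscapedPropertiesDemo and the key "greeting" are hypothetical; Properties.load(InputStream) is the standard JDK call, which decodes bytes as ISO-8859-1 and expands the escapes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class EscapedPropertiesDemo {

    // Loads properties text the way a stream-based loader does on Java 8:
    // bytes are decoded as ISO-8859-1 and Unicode escapes are expanded.
    static Properties loadLatin1(String fileContent) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileContent.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Each doubled backslash is one literal backslash in the simulated file,
        // so the file contains six ASCII characters per escaped letter.
        String fileContent = "greeting=\\u041F\\u0440\\u0438\\u0432\\u0435\\u0442\n";
        // Prints the Cyrillic word for "hello", decoded from the escapes.
        System.out.println(loadLatin1(fileContent).getProperty("greeting"));
    }
}
```

Because the file itself stays pure ASCII, it loads identically on every Java version, at the cost of being unreadable in a plain editor.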

The InputStreamReader magic trick

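Suffering from mojibake? Garbled text appears when UTF-8 bytes are decoded as ISO-8859-1. Combat it by wrapping the stream in an InputStreamReader with the UTF-8 charset before handing it to PropertyResourceBundle or Properties.load.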
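The difference is easiest to see side by side. This sketch parses the same UTF-8 bytes twice: once through an InputStreamReader with an explicit charset, and once through the stream overload of Properties.load, which always assumes ISO-8859-1. The class and method names (Utf8ReaderDemo, loadUtf8, loadLatin1) and the sample key are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class Utf8ReaderDemo {

    // Decodes the bytes explicitly as UTF-8 before the properties parser sees them.
    static ResourceBundle loadUtf8(byte[] fileBytes) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Properties.load(InputStream) decodes as ISO-8859-1, whatever the file really contains.
    static Properties loadLatin1(byte[] fileBytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Simulates a properties file saved as UTF-8 (the value contains a u-umlaut).
        byte[] fileBytes = "city=M\u00FCnchen\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("reader: " + loadUtf8(fileBytes).getString("city"));
        System.out.println("stream: " + loadLatin1(fileBytes).getProperty("city"));
    }
}
```

The reader-based value comes back intact, while the stream-based one turns the single accented letter into two unrelated Latin-1 characters.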

Ensuring harmony with your environment

Google App Engine: Things to remember

Working with Google App Engine (GAE)? Ensure its environment constraints are respected when handling resource encodings. It's best not to upset the "App Engine gods".

IDE settings: Friend or foe?

Frequently, actual encoding hitches stem from wrongly configured IDE settings. Investigate your IDE setup and align it with your charset needs to avoid playing "blame the IDE".

Tools to the rescue: native2ascii

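When UTF-8 encoding troubles persist, use native2ascii to convert properties files to plain ASCII, with every non-ASCII character rewritten as a \uXXXX escape. The result is valid ISO-8859-1, so it loads correctly on any Java version. Note that the native2ascii tool ships with JDK 8 and earlier; it was removed in JDK 9.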
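If the native2ascii tool is not available, the core of what it does to a value is easy to approximate. The helper below is a rough sketch of that escaping step, not a replacement for the tool: AsciiEscaper and escapeNonAscii are hypothetical names, and the sketch handles a single string rather than a whole file.

```java
final class AsciiEscaper {

    // Rewrites every char above 0x7F as a backslash-u escape with four hex digits.
    // ASCII characters pass through unchanged. Characters outside the Basic
    // Multilingual Plane are emitted as two escapes, one per surrogate.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input contains a u-umlaut; the output contains only ASCII characters.
        System.out.println(escapeNonAscii("city=M\u00FCnchen"));
    }
}
```

Properties.load expands these escapes back into the original characters when the file is read.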

Effective techniques to master UTF-8 handling

Sophisticated property loading

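Wrap an InputStream in an InputStreamReader and pass it to PropertyResourceBundle for a more flexible approach, especially in pre-Java 9 environments. This lets you choose the charset yourself and load files from outside the classpath.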

Quick encoding conversion trick

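Facing an incorrectly read string? If UTF-8 bytes were decoded as ISO-8859-1, recover the original bytes and decode them again: new String(value.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8). It's a handy ally in encoding-land, but treat it as a stopgap; fixing the encoding at load time is more reliable.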
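A minimal sketch of that round trip, assuming the value was produced by decoding UTF-8 bytes as ISO-8859-1 and nothing else touched it in between. MojibakeRepair and latin1ToUtf8 are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeRepair {

    // ISO-8859-1 maps each char up to 0xFF back to the single byte it came from,
    // so getBytes recovers the original file bytes, which are then decoded as UTF-8.
    static String latin1ToUtf8(String misread) {
        byte[] originalBytes = misread.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A u-umlaut whose two UTF-8 bytes were each read as a separate Latin-1 character.
        String misread = "M\u00C3\u00BCnchen";
        System.out.println(latin1ToUtf8(misread));
    }
}
```

Applying the conversion to a string that was decoded correctly will corrupt it, so it only belongs where you know the input is mis-decoded.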

Manual character inputs: A big no!

Manual input of non-ASCII characters? Think twice! Use \uXXXX escape sequences, or make sure your editor saves the file in the encoding your loader expects, to avoid unexpected misfits.