
Getting a File's MD5 Checksum in Java

java
md5
streams
exceptions
by Nikita Barsukov · Aug 7, 2024
TLDR

Here's a quick and easy method to calculate an MD5 checksum in Java using MessageDigest:

import java.io.*;
import java.nio.file.*;   // Files and Paths live here
import java.security.*;

public class ChecksumHelper {

    public static String getMD5Checksum(String path) throws NoSuchAlgorithmException, IOException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream is = Files.newInputStream(Paths.get(path));
             DigestInputStream dis = new DigestInputStream(is, md)) {
            byte[] buffer = new byte[4096];
            while (dis.read(buffer) != -1) {
                // Keep calm and keep on reading: every read through dis also updates md.
            }
        }
        // Turning bytes into beauty.
        return convertToHex(md.digest());
    }

    // Formats each byte as two lowercase hex digits.
    private static String convertToHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}

Call ChecksumHelper.getMD5Checksum with your file path and get your sweet MD5 sum back as a 32-character lowercase hex string.

Safety first: dealing with streams and exceptions

Try-with-resources: friend of the streams

Working with any kind of stream? Remember that try-with-resources is your pal: it closes every stream declared in its header when the block exits, whether normally or via an exception, so you don't leak file handles. With DigestInputStream, you're reading, computing the checksum, and managing the stream, all in one go!
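To see the closing behaviour for yourself, here's a minimal sketch. It assumes a hypothetical TrackedStream (not part of the JDK) that records whether close() was called, and an in-memory byte array standing in for a file. It's a demonstration under those assumptions, not production code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; the names are illustrative only.
class CloseDemo {

    // In-memory stream that remembers whether close() was called.
    static class TrackedStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackedStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Returns true if the underlying stream was closed once the try block exited.
    static boolean closedAfterDigest(byte[] data, boolean failMidway) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        TrackedStream raw = new TrackedStream(data);
        try (InputStream is = raw;
             DigestInputStream dis = new DigestInputStream(is, md)) {
            byte[] buffer = new byte[4096];
            while (dis.read(buffer) != -1) {
                if (failMidway) {
                    throw new IOException("simulated read failure");
                }
            }
        } catch (IOException e) {
            // Both resources have already been closed by the time we land here.
        }
        return raw.closed;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = {1, 2, 3};
        System.out.println(closedAfterDigest(data, false)); // true
        System.out.println(closedAfterDigest(data, true));  // true
    }
}
```

Either way, the stream ends up closed, which is exactly the guarantee you want when a read blows up halfway through a file.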

Handling those I/O exceptions

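The method getMD5Checksum declares two checked exceptions, NoSuchAlgorithmException and IOException, which your higher-level methods must catch or propagate. IOException is the one you'll actually meet (missing file, no read permission, failing disk). Every Java platform implementation is required to support MD5, so a NoSuchAlgorithmException here points to a broken runtime rather than bad input.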
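Here's one way a caller might treat those two exceptions differently. This is a sketch under assumptions: ChecksumCaller, md5Hex, and tryChecksum are hypothetical names, md5Hex is a compact stand-in for getMD5Checksum so the example compiles on its own, and returning an empty Optional on I/O failure is just one possible policy.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Optional;

// Hypothetical caller; the class and method names are illustrative only.
class ChecksumCaller {

    // Compact stand-in for getMD5Checksum so this sketch is self-contained.
    static String md5Hex(Path file) throws NoSuchAlgorithmException, IOException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Returns the checksum, or an empty Optional when the file cannot be read.
    static Optional<String> tryChecksum(Path file) {
        try {
            return Optional.of(md5Hex(file));
        } catch (NoSuchFileException e) {
            // A more specific IOException: the path does not exist.
            System.err.println("File not found: " + e.getFile());
            return Optional.empty();
        } catch (IOException e) {
            System.err.println("Could not read " + file + ": " + e.getMessage());
            return Optional.empty();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm, so treat this as a broken runtime.
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get(args.length > 0 ? args[0] : "file.txt");
        System.out.println(tryChecksum(file).orElse("<no checksum>"));
    }
}
```

Recoverable problems (the I/O ones) get reported and turned into an empty result, while the "this should never happen" case is rethrown unchecked so it doesn't get silently swallowed.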

Memory-friendly: working with big ol’ files

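Got a massive file on your hands? Don't load it into memory in one go (for example with Files.readAllBytes). Read it in fixed-size chunks instead and feed each chunk to the digest: memory use then stays at roughly the buffer size no matter how big the file is, which keeps you clear of OutOfMemoryError.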
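The chunked approach can be sketched with MessageDigest.update, which accepts the data piece by piece. ChunkedDigest and its bufferSize parameter are hypothetical names for illustration, and a zero-filled byte array stands in for a large file here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the class and method names are illustrative only.
class ChunkedDigest {

    // Hashes the stream while holding at most bufferSize bytes of it in memory.
    static String md5Hex(InputStream in, int bufferSize) throws IOException, NoSuchAlgorithmException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[bufferSize];
        int read;
        while ((read = in.read(buffer)) != -1) {
            // Only the bytes actually read in this round go into the digest.
            md.update(buffer, 0, read);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[1_000_000]; // stand-in for a big file
        String tiny = md5Hex(new ByteArrayInputStream(data), 64);
        String roomy = md5Hex(new ByteArrayInputStream(data), 8192);
        System.out.println(tiny.equals(roomy)); // true
    }
}
```

The chunk size changes how many reads you do, not the result, so pick a buffer (a few kilobytes is a common choice) and forget about it.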

Exploring the neighbourhood: alternative libraries and methods

Apache Commons Codec: short and sweet

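Need a one-liner? Ring the bell for DigestUtils.md5Hex from the Apache Commons Codec. The hashing itself is a single call; wrap the stream in try-with-resources so it still gets closed:

try (InputStream in = new FileInputStream("file.txt")) {
    String md5Checksum = DigestUtils.md5Hex(in);
}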

Google Guava: clean simplicity

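Google Guava has a crisp solution as well: get a ByteSource from Files.asByteSource() and call its hash() method. Note that Files here is com.google.common.io.Files, not java.nio.file.Files, and that recent Guava releases mark Hashing.md5() as deprecated to steer you away from MD5:

HashCode md5Checksum = Files.asByteSource(new File("file.txt")).hash(Hashing.md5());

Call md5Checksum.toString() to get the hex string.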

Beyond MD5: stepping into the secure zone

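Using your checksum for cryptographic purposes? MD5 is not your best bet: practical collision attacks exist, so it's only suitable for catching accidental corruption, not deliberate tampering. For file integrity, reach for SHA-256. bcrypt and scrypt solve a different problem: they are deliberately slow functions for storing passwords, not for checksumming files.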
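Switching to SHA-256 with MessageDigest mostly comes down to changing the algorithm name. A minimal sketch, assuming a hypothetical FileHasher class whose hashHex method takes the algorithm as a parameter:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the class and method names are illustrative only.
class FileHasher {

    // algorithm is a standard MessageDigest name such as "MD5" or "SHA-256".
    static String hashHex(Path file, String algorithm) throws NoSuchAlgorithmException, IOException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (InputStream is = Files.newInputStream(file);
             DigestInputStream dis = new DigestInputStream(is, md)) {
            byte[] buffer = new byte[8192];
            while (dis.read(buffer) != -1) {
                // Reading through the DigestInputStream feeds the bytes to md.
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get(args.length > 0 ? args[0] : "file.txt");
        System.out.println(hashHex(file, "SHA-256"));
    }
}
```

A SHA-256 digest is 32 bytes, so the hex string comes out at 64 characters instead of MD5's 32.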