Read a URL into a string in a few lines of Java code

java
try-with-resources
io-exceptions
java-8-streams
by Alex Kataev · Aug 24, 2024
TLDR

Get a URL into a string in Java with URL.openStream() and InputStream.readAllBytes() (Java 9+):

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class URLToString {
        public static String readURL(String urlString) throws IOException {
            // Straight to the point like a javelin throw: open, read every byte, decode as UTF-8
            try (InputStream in = new URL(urlString).openStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }

        public static void main(String[] args) throws IOException {
            // URL content to string in no time. Holding my coffee still hot
            String content = readURL("http://example.com");
            System.out.println(content);
        }
    }

This compact piece of code opens the stream, reads it fully, and decodes it explicitly as UTF-8. A tempting alternative, Files.readAllBytes(Paths.get(new URL(urlString).toURI())), only works for URLs backed by an installed file system provider (such as file:). For an http URL, Paths.get throws FileSystemNotFoundException.

One-stop solution: Handling resources

The try-with-resources statement (Java 7's gift to developers) closes each declared resource automatically when the block exits, even if an exception is thrown. This applies to InputStream and every other AutoCloseable resource:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.stream.Collectors;

    public class URLReader {
        public static String readFromURL(String urlString) {
            // Both resources are declared in the try header, so both are closed automatically
            try (InputStream stream = new URL(urlString).openStream();
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(stream, StandardCharsets.UTF_8))) {
                // Reads lines from the stream and joins them with newlines, akin to ants carrying a breadcrumb
                return reader.lines().collect(Collectors.joining("\n"));
            } catch (IOException e) {
                // Oops! Things didn't go well. Gracefully returning an empty string
                e.printStackTrace();
                return "";
            }
        }
    }

Your pick: Trade-offs and choices

URL.openStream() offers simplicity, but when you need more control over the process (timeouts, request headers), URLConnection offers more levers. InputStream.readAllBytes() (a gem from Java 9's quiver) or tools like Apache Commons IO can shorten the read-everything step.
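
As a rough sketch of the extra levers URLConnection gives you, the example below sets connect and read timeouts and a request header before reading the body. The class name, method name, timeout parameter, and header value are illustrative assumptions, and the code assumes a UTF-8 body and Java 9+ for readAllBytes().

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class URLConnectionReader {
    // Hypothetical helper: reads a URL with explicit timeouts (in milliseconds)
    static String readWithTimeouts(String urlString, int timeoutMillis) throws IOException {
        URLConnection connection = new URL(urlString).openConnection();
        connection.setConnectTimeout(timeoutMillis); // give up if connecting takes too long
        connection.setReadTimeout(timeoutMillis);    // give up if the server stalls mid-response
        connection.setRequestProperty("Accept", "text/html"); // example header, adjust as needed
        try (InputStream in = connection.getInputStream()) {
            // Assumes the body is UTF-8; readAllBytes() requires Java 9+
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

A call such as URLConnectionReader.readWithTimeouts("http://example.com", 5000) would fail with an IOException instead of hanging when the server stalls for longer than five seconds.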

The charset conundrum: Ensuring reliable decoding

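Specifying a charset is key when reading a URL's contents. Passing an explicit charset such as StandardCharsets.UTF_8 means platform default charset differences won't spoil your party. If the server declares a different charset in its Content-Type header, use that one instead.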
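
To illustrate why the charset matters, here is a small sketch that decodes the same bytes two ways. The class name, helper method, and sample word are assumptions chosen for the demonstration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Hypothetical helper: decodes raw bytes using the charset the caller names
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "café" encoded as UTF-8: the é becomes two bytes (0xC3 0xA9)
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8Bytes, StandardCharsets.UTF_8);
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.length()); // 4
        System.out.println(wrong.length()); // 5, because each byte became its own character
    }
}
```

The wrong decoding does not throw. It silently produces garbled text, which is why naming the charset explicitly is worth the extra argument.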

Going Jurassic Park: Handling exceptions

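Expect the unexpected, or I/O exceptions in this case: a malformed URL, an unreachable host, or a dropped connection all surface as an IOException. Handle them gracefully. If you read the stream with a Scanner, check Scanner.hasNext() before calling next() to avoid being caught off-guard by a NoSuchElementException on an empty response.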
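
The Scanner approach could look like the sketch below. The class and method names are assumptions. The "\\A" delimiter is a common trick that makes the whole stream a single token.

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class ScannerReader {
    // Hypothetical helper: reads an entire stream as one token.
    // "\\A" matches only the start of input, so the first token runs to the end of the stream.
    static String readAll(InputStream stream) {
        try (Scanner scanner = new Scanner(stream, StandardCharsets.UTF_8.name())) {
            scanner.useDelimiter("\\A");
            // An empty stream has no token, so next() alone would throw NoSuchElementException
            return scanner.hasNext() ? scanner.next() : "";
        }
    }
}
```

To use it with a URL, pass new URL(urlString).openStream() as the stream. Closing the Scanner also closes the underlying stream.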

Handling the titans: Large data strategy

When dealing with the Godzilla of datasets, avoid building one giant string. Processing the content incrementally with a BufferedReader, or directly from the InputStream, could offer better control over memory usage and performance.
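
One way to keep memory bounded is to hand each line to a callback instead of collecting everything. This is a minimal sketch under that assumption. The class name, method name, and the Consumer-based design are illustrative choices, not the only option.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class StreamingReader {
    // Hypothetical helper: passes each line to the caller and returns the number of lines seen,
    // so only one line at a time needs to be held in memory.
    static long forEachLine(InputStream stream, Consumer<String> lineHandler) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lineHandler.accept(line);
                count++;
            }
        }
        return count;
    }
}
```

For a URL, a call might look like StreamingReader.forEachLine(new URL(urlString).openStream(), System.out::println).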

When you're not alone: Using libraries

Consider using the Apache Commons IO library's IOUtils.toString() to unravel the URL content with less code. Don't forget to include the requisite Maven or Gradle dependency in your project.

    import org.apache.commons.io.IOUtils;

    import java.io.IOException;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class URLContentReader {
        public static String readURLContent(String urlString) throws IOException {
            // Apache Commons IO to the rescue: opens, reads, and decodes in one call
            return IOUtils.toString(new URL(urlString), StandardCharsets.UTF_8);
        }
    }

Moving with times: Versatility through Java evolution

Java's evolution offers multiple avenues. Whether you prefer Java 8's streams, Java 11's HttpClient, or third-party libraries, there's always a suitable solution.
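
For the Java 11 route, a minimal sketch with java.net.http.HttpClient might look like this. The class name, method name, ten-second timeout, and redirect policy are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class HttpClientReader {
    // Hypothetical helper: fetches a URL and returns the response body as a string (Java 11+)
    static String fetch(String urlString) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))      // example timeout
                .followRedirects(HttpClient.Redirect.NORMAL) // follow redirects, except https to http
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(urlString))
                .GET()
                .build();
        // ofString() decodes with the charset from Content-Type, falling back to UTF-8
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

This sketch returns the body regardless of status code. Real code would likely check response.statusCode() before trusting the content.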