Explain Codes LogoExplain Codes Logo

Detect URLs in text with JavaScript

javascript
prompt-engineering
functions
callbacks
Nikita BarsukovbyNikita Barsukov·Jan 23, 2025
TLDR
const urls = text.match(/https?:\/\/\S+/g) || [];

Only two lines are required to detect URLs in javascript. A regex inside of text.match() directly extracts URLs from the text string. The \S+ ensures it matches all non-whitespace characters after http:// or https://, stopping at the first space encountered. If it finds URLs, it will return an array; otherwise, an empty array is given back.

Working with diverse URL types

URLs come in different formats and not all begin with http. In fact, protocols such as ftp, file, and others are quite common in URL schemes and should be accounted for while detecting URLs.

Recognizing a wider range of URL patterns

Trailing punctuation, considered part of the URL by naïve regex, should be handled correctly. Use the following pattern:

// "Does your URL driver's license cover ftp and file protocols too?" - Policeman Regex const urlPattern = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#/%=~_|$?!:,.]*)/gi; const urls = text.match(urlPattern) || [];

This regex handles different protocols, addresses trailing punctuation, and doesn't mistake them as part of the URL.

Handling false positives in URL detection

Our good old regex can sometimes bite off more than it can chew, causing false positives. Perfect detection is possible with machine learning, but since that's like using a chainsaw to cut a stick of butter, utilize existing tools such as LinkifyJS or use tested, open source patterns from Android's Linkify.

If your goal is to turn detected URLs into clickable links, a linkify function is what you need:

// "Prepare for linkification!" - Captain Regex function linkify(inputText) { var replacedText, replacePattern; replacePattern = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#/%=~_|$?!:,.]*)/gi; replacedText = inputText.replace(replacePattern, '<a href="$1" target="_blank">$1</a>'); replacePattern = /(^|[^/])(www\.[\S]+(\b|$))/gim; replacedText = replacedText.replace(replacePattern, '$1<a href="http://$2" target="_blank">$2</a>'); return replacedText; }

Shoutout to the replace method! It substitutes URLs with HTML anchor tags. The target "_blank" ensures the links open in a new browser tab, avoiding any steering-away-from-content fiascos.

Auto-conversion in selected elements

Let your program scan and convert text within chosen DOM elements:

function autoLinkify(selector) { document.querySelectorAll(selector).forEach(element => { element.innerHTML = linkify(element.textContent); }); }

Advanced URL matching for the brave

For those brave souls who desire extra precision and ambition, lookup RFC 1738 and the updated top-level domain lists. These tools will help ensure nothing slips through your pattern's grasp, not even the newest URL schemas. Winning!

Pro-tips for integration

  • Harness this power with a CMS for dynamic text.
  • Add an email address detection, taking inspiration from Android's Linkify. Multitasking, yay!
  • Don't forget to sanitize user input. Stay XSS safe, my friends.

References