Html Entity Decode

by Nikita Barsukov · Dec 19, 2024
TLDR

Here's a JavaScript shortcut for decoding HTML entities in the browser. Feed the string to a detached textarea through innerHTML, then read it back through textContent:

function htmlDecode(html) {
  var elem = document.createElement('textarea'); // Fancy a textarea?
  elem.innerHTML = html;   // Fed the HTML to our textarea gourmet!
  return elem.textContent; // Out comes the tastefully decoded entities!
}

console.log(htmlDecode('The &amp; is &lt;strong&gt;')); // "The & is <strong>"

This easy trick transforms entities like &amp;, &lt;, and &quot; into &, <, and ", without any extra library dependencies. Talk about travelling light!

Optimized decoding: Less is more!

Let's cut overheads. In an environment without jQuery, the above method is a streamlined solution for decoding HTML entities. It leans on the browser's own HTML parser, so there is no entity table to ship or maintain.

Safety first: Preventing Cross-Site Scripting (XSS)

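Security-wise, read the result through .textContent rather than .innerHTML, and when you display the decoded string, assign it with .textContent too. The textarea matters: it treats its content as plain text rather than markup, whereas a div given the same input would build real elements, and something like an img with an onerror handler can run script even while detached. Prevention is better than cure, eh?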

Special character handling: An expert juggler!

Dealing with special HTML characters while decoding? Because the browser's parser does the work, our htmlDecode function handles named entities such as &copy; and &nbsp; as well as numeric references such as &#39; and &#x27;, not just the common four. Abracadabra!

Reusability: Rinse & Repeat

For better reusability, keep your decoding logic wrapped in a single function, as we do above, and call it wherever you need it. A scratchpad such as JSFiddle is great for checking it against known inputs.
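To make the testing advice concrete, here is a minimal sketch of a table-driven check. The names checkDecoder and decodeAmpOnly are hypothetical and not part of the original article; decodeAmpOnly is a deliberately incomplete stand-in so the sketch runs without a DOM. In a browser you would pass htmlDecode instead.

```javascript
// Hypothetical helper: runs a decoder against [input, expected] pairs
// and returns a list of the cases that did not match.
function checkDecoder(decode, cases) {
  var failures = [];
  cases.forEach(function (pair) {
    var input = pair[0];
    var expected = pair[1];
    var actual = decode(input);
    if (actual !== expected) {
      failures.push({ input: input, expected: expected, actual: actual });
    }
  });
  return failures;
}

// Stand-in decoder (assumption for illustration): only knows about &amp;.
function decodeAmpOnly(html) {
  return html.replace(/&amp;/g, '&');
}

var sampleFailures = checkDecoder(decodeAmpOnly, [
  ['a &amp; b', 'a & b'], // passes
  ['&lt;', '<']           // fails: the stand-in does not decode &lt;
]);
console.log(sampleFailures.length); // 1
```

Returning the failing cases, rather than throwing on the first mismatch, makes it easier to see every entity a decoder misses in one run.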

Supercharging jQuery: A plugin's perks

For jQuery fans, convert the htmlDecode function into a jQuery plugin for ease of access:

$.fn.htmlDecode = function() {
  return $("<textarea/>").html(this.html()).text(); // Give a text, Take a text!
};

Presto! Decode HTML entities across your projects with $(element).htmlDecode(). jQuery to the rescue!

HTML entity filtering: Safety goggles ON!

Mike Samuel's advice: always sanitize untrusted input so that HTML tags cannot take effect. The function here does that by escaping markup rather than stripping it:

function sanitize(html) {
  var div = document.createElement('div');
  div.textContent = html; // No more monkey business, HTML!
  return div.innerHTML;
}

This method escapes &, <, and >, so any HTML tags come out as harmless plain text. It leaves quotes untouched, so don't rely on it for attribute values.
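Where no DOM is available (Node, for example), the same idea can be sketched with plain string replacement. The name escapeHtml and the choice of five characters are assumptions for illustration, not part of the original; unlike the div-based version, this sketch also escapes both kinds of quote.

```javascript
// Hypothetical DOM-free escaper: replaces the five characters that matter
// in HTML text and quoted attribute values.
var ESCAPE_MAP = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

function escapeHtml(text) {
  // A single pass over the input, so an inserted '&' is never escaped twice.
  return String(text).replace(/[&<>"']/g, function (ch) {
    return ESCAPE_MAP[ch];
  });
}

console.log(escapeHtml('<a href="x">Tom & Jerry</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Escaping is only appropriate for HTML text and quoted attribute values; other contexts such as URLs or inline scripts need their own encoding.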

Regex to the rescue: Match and Replace

How about regex? Team it with a preset lookup table of entities for replacing, avoiding any DOM object creation, which also means it works outside the browser:

function regexDecode(html) {
  var entities = {'&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"'};
  return html.replace(/&amp;|&lt;|&gt;|&quot;/g, function(match) {
    return entities[match]; // Swift swap!
  });
}

Tailor this approach as required with more entities, and keep it DOM-element free.
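As one possible way to tailor it, the sketch below adds decimal and hexadecimal numeric references on top of a small named-entity table. The name regexDecodeExtended and the contents of NAMED_ENTITIES are assumptions for illustration; the table is nowhere near the full HTML entity list, and unknown names are left untouched.

```javascript
// Assumed, intentionally small table of named entities; extend as needed.
var NAMED_ENTITIES = {
  amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: '\u00A0'
};

function regexDecodeExtended(html) {
  // Matches &#xHEX;  &#DECIMAL;  or  &name;  in a single pass.
  return html.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, function (match, body) {
    if (body.charAt(0) === '#') {
      var isHex = body.charAt(1) === 'x' || body.charAt(1) === 'X';
      var codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave references beyond the Unicode range unchanged.
      if (codePoint > 0x10FFFF) return match;
      return String.fromCodePoint(codePoint);
    }
    // Named entities are case-sensitive; unknown names pass through as-is.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(regexDecodeExtended('&lt;b&gt; &#39;hi&#x27; &unknown;'));
// <b> 'hi' &unknown;
```

Because everything happens in one replace pass, input such as &amp;lt; decodes to the literal text &lt; rather than being decoded twice. Unlike a browser, this sketch does not decode legacy entities written without a trailing semicolon.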