
How to HTML encode/escape a string? Is there a built-in?

By Alex Kataev · Dec 1, 2024
TLDR

Here's a quick JavaScript function that uses the DOM's textContent property to perform HTML encoding:

function htmlEncode(str) {
  var div = document.createElement('div');
  div.textContent = str;
  return div.innerHTML;
}

// Usage
console.log(htmlEncode('<script>')); // &lt;script&gt;, try running this, we dare you!

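This function turns <, >, and & into HTML entities (&lt;, &gt;, &amp;), helping prevent XSS attacks when injecting text into HTML element content. It needs a browser DOM to run, and it leaves quotes untouched, so don't rely on it for text that ends up inside an attribute value.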

Ruby: how to HTML-encode the developer way

Ruby's standard library ships a CGI class with an escapeHTML method, perfect for escaping those pesky HTML characters (&, <, >, and both kinds of quotes):

require 'cgi'

encoded_string = CGI.escapeHTML("<script>alert('XSS!')</script>")
puts encoded_string
# prints "&lt;script&gt;alert(&#39;XSS!&#39;)&lt;/script&gt;"
# XSS-fears: 'Bye, Felicia!', HTML: 'Hello, world!'

These built-in methods are as reliable as Grandma's apple pie recipe – they get the job done right.
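To see what the escaping covers, here is a minimal sketch that escapes a string containing all five special characters and then reverses it with CGI.unescapeHTML. The `user_input` value is made up for illustration.

```ruby
require 'cgi'

# Hypothetical user input, used only for illustration.
user_input = %q{<a href="x" onclick='evil()'>Tom & Jerry</a>}

# escapeHTML replaces &, <, >, " and ' with entities.
escaped = CGI.escapeHTML(user_input)
puts escaped
# prints: &lt;a href=&quot;x&quot; onclick=&#39;evil()&#39;&gt;Tom &amp; Jerry&lt;/a&gt;

# unescapeHTML turns the entities back into the original characters.
restored = CGI.unescapeHTML(escaped)
puts restored == user_input
# prints: true
```

Because quotes are escaped as well, the result should be usable inside a quoted attribute value, not just between tags.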

Rails: Escaping HTML or playing 'The Floor is Lava'

With Rails, things get even easier. Rails 3 and later automatically escape any string you output in a view with <%= %>, so the built-in h helper rarely needs to be called explicitly anymore. So, thank Rails for playing babysitter. If you need to switch off this nanny-mode for an HTML string you fully trust (never raw user input), use the raw helper:

<%# "I solemnly swear that I am up to no good." %>
<%= raw "<a href='http://example.com'>Link</a>" %>
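Outside Rails, plain ERB from Ruby's standard library does not escape anything for you, so the h helper has to be called by hand. The following is a rough sketch of that difference, not Rails code; the `comment` variable and both templates are invented for the example.

```ruby
require 'erb'

# Hypothetical untrusted value, used only for illustration.
comment = "<script>alert('XSS!')</script>"

# Plain ERB inserts the value as-is: nothing is escaped.
unescaped = ERB.new("<p><%= comment %></p>").result(binding)
puts unescaped
# prints: <p><script>alert('XSS!')</script></p>

# Calling ERB::Util.h (an alias of html_escape) escapes it explicitly.
escaped = ERB.new("<p><%= ERB::Util.h(comment) %></p>").result(binding)
puts escaped
# prints: <p>&lt;script&gt;alert(&#39;XSS!&#39;)&lt;/script&gt;</p>
```

In a Rails view the second form is what <%= %> gives you by default, which is why raw has to be an explicit opt-out.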

UTF-8: Conquering the Web at the speed of light

If UTF-8 compatibility keeps you up at night (we've all been there), then consider those sleepless nights over. The CGI::escapeHTML and h methods handle UTF-8 like champs: they only replace the handful of ASCII special characters (&, <, >, and quotes) and leave multibyte characters untouched. After all, UTF-8 is like the cool kids table of web encoding these days.
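A quick way to convince yourself is to escape a string that mixes accented characters with markup. This sketch uses \u escapes so the string is UTF-8 regardless of how the script is run; the sample text is arbitrary.

```ruby
require 'cgi'

# "café & <b>naïve</b>" written with \u escapes so the string is always UTF-8.
text = "caf\u00E9 & <b>na\u00EFve</b>"

escaped = CGI.escapeHTML(text)
puts escaped

# The accented characters pass through unchanged; only &, < and > are replaced.
puts escaped == "caf\u00E9 &amp; &lt;b&gt;na\u00EFve&lt;/b&gt;"
# prints: true

# The result keeps the input's encoding.
puts escaped.encoding
# prints: UTF-8
```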

Battle of the HTML Escapers: CGI vs ERB vs Rack

Comparing HTML 'escape artists', we see different methods at play in Ruby:

  • The ERB::Util.html_escape method, from the standard library's erb, escapes HTML quicker than Houdini at a pool party:
require 'erb'
puts ERB::Util.html_escape("<script>alert('XSS!')</script>") # "abracadabra and all that jazz"
  • In yet another corner, we have Rack::Utils.escape_html, which comes from the rack gem rather than the standard library. Rack has your back when it comes to HTML escaping:
require 'rack'
puts Rack::Utils.escape_html("<script>alert('XSS!')</script>") # here's another rabbit from the hat
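For the two standard-library contenders, a side-by-side check is easy to sketch. Rack is left out here only because it is a separate gem that may not be installed.

```ruby
require 'cgi'
require 'erb'

input = "<script>alert('XSS!')</script>"

cgi_out = CGI.escapeHTML(input)
erb_out = ERB::Util.html_escape(input)

puts cgi_out
# prints: &lt;script&gt;alert(&#39;XSS!&#39;)&lt;/script&gt;

# For this input, both methods produce the same string.
puts cgi_out == erb_out
# prints: true
```

In plain Ruby the choice between the two mostly comes down to which library you already have loaded.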

DIY escaping: The biter bit

Hold your horses, brave coder, before you embark on the treacherous journey of a custom (read: DIY) solution for HTML escaping. Built-in methods are like those 'stay at home' orders: they're there for a reason. Hence, let CGI, ERB, or Rack save your day with their built-in escape functions.
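As one illustration of how DIY escaping goes sideways, here is a deliberately buggy sketch. The `naive_escape` helper is hypothetical and exists only to show a common ordering mistake: replacing & last re-escapes the entities the earlier steps just produced.

```ruby
require 'cgi'

# Hypothetical hand-rolled escaper with a classic bug:
# "&" is replaced last, so "&lt;" and "&gt;" get escaped a second time.
def naive_escape(str)
  str.gsub('<', '&lt;').gsub('>', '&gt;').gsub('&', '&amp;')
end

input = "<b>Tom & Jerry</b>"

puts naive_escape(input)
# prints: &amp;lt;b&amp;gt;Tom &amp; Jerry&amp;lt;/b&amp;gt;

puts CGI.escapeHTML(input)
# prints: &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
```

The naive version also forgets quotes entirely, which is one more reason to let the built-ins handle it.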