
How to use JavaScript regex over multiple lines?

javascript
regex
performance
best-practices
By Alex Kataev · Feb 22, 2025
TLDR

To match across line breaks in a JavaScript regex, use the s (dotAll) flag, which lets . match newlines, or the [\s\S] character class, which matches any character without a flag.

/s flag:

const regex = /pattern/s; // '.' matches any characters including newlines

[\s\S] (any char):

const regex = /pattern[\s\S]*/; // A hack for '.' to include newlines

Multiline matching basics

To match characters across multiple lines, use the s flag, or fall back to [\s\S] where the flag is unsupported in your environment. Note that the m flag does not help here: it only changes where ^ and $ match, not what . matches.

// Matches "Multi\nline\nstring!" across lines
const multilineRegex = /^Multi[\s\S]*string!$/;
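As a minimal sketch of the difference, the snippet below runs three variants of the same pattern against one three-line string. The variable names are illustrative only.

```javascript
const text = "Multi\nline\nstring!";

// Without the s flag, '.' stops at line terminators, so this fails.
const dotOnly = /^Multi.*string!$/.test(text);

// With the s (dotAll) flag, '.' also matches '\n'.
const dotAll = /^Multi.*string!$/s.test(text);

// [\s\S] matches any character, with no flag needed.
const charClass = /^Multi[\s\S]*string!$/.test(text);

console.log({ dotOnly, dotAll, charClass });
// → { dotOnly: false, dotAll: true, charClass: true }
```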

Controlling greediness in regex patterns

To avoid matching too much text (greedy matching), use the lazy quantifiers *? or +?. These non-greedy quantifiers match as few characters as possible while still letting the rest of the pattern succeed.

// Matches the smallest possible string within <pre> tags
const nonGreedyRegex = /<pre>[\s\S]+?<\/pre>/;
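To see the effect, here is a small sketch that applies a greedy and a lazy version of the same pattern to a made-up snippet containing two <pre> blocks. The sample string is an assumption for illustration.

```javascript
// Hypothetical sample input with two separate <pre> blocks.
const html = "<pre>first</pre>\n<p>between</p>\n<pre>second</pre>";

// Greedy: runs to the LAST </pre>, swallowing everything in between.
const greedy = html.match(/<pre>[\s\S]+<\/pre>/)[0];

// Lazy: stops at the FIRST </pre> it can reach.
const lazy = html.match(/<pre>[\s\S]+?<\/pre>/)[0];

console.log(greedy === html); // → true
console.log(lazy);            // → <pre>first</pre>
```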

Performance in multiline regex

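Performance matters when a multiline pattern runs over large inputs. Lazy quantifiers such as *? or +? can be slower than their greedy counterparts, because the engine checks the rest of the pattern after each character it consumes. Benchmark with representative text before settling on a pattern, especially for large text bodies.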

// Simple benchmark example
// (largeText and regexPattern are placeholders for your own input and pattern)
const start = performance.now();
const match = largeText.match(regexPattern); // Could be as fast as Flash or as slow as a turtle!
const end = performance.now();
console.log(`Matching took ${end - start} milliseconds`);

Pitfalls and how to avoid them

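Without the s flag, a bare . never matches a line break, and the alternation (.|[\r\n]) can be slower because the engine tries two branches and records a capture group for every character. Therefore, it's preferable to use [\s\S] or the s flag for matching across multiple lines.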
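One way to check this on your own engine is a rough timing comparison like the sketch below. The input and the timeMatch helper are hypothetical, and the timings will vary between engines and runs, so treat them as indicative only.

```javascript
// Hypothetical input: 1,000 short lines wrapped in a single <pre> block.
const body = "line of text\n".repeat(1000);
const sample = `<pre>${body}</pre>`;

// Hypothetical helper: time one match and report how much text it matched.
function timeMatch(label, regex, input) {
  const start = performance.now();
  const match = input.match(regex);
  const elapsed = performance.now() - start;
  return { label, matchedLength: match ? match[0].length : 0, elapsed };
}

const results = [
  timeMatch("alternation", /<pre>(?:.|[\r\n])*?<\/pre>/, sample),
  timeMatch("character class", /<pre>[\s\S]*?<\/pre>/, sample),
  timeMatch("s flag", /<pre>.*?<\/pre>/s, sample),
];

// All three should match the same text; only the elapsed time differs.
for (const r of results) {
  console.log(`${r.label}: ${r.matchedLength} chars in ${r.elapsed.toFixed(3)} ms`);
}
```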

DOM methods over regex

Need to extract <pre> elements from the DOM? DOM methods, or a library like jQuery, are more efficient and reliable than regex.

// Using standard DOM methods
const preElements = document.getElementsByTagName("pre");

// Using jQuery for simpler syntax (assumes jQuery is loaded)
const $preElements = $('pre');

Handling line endings in regex

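A lone \r in your pattern does not cover every line ending. To match all of them (Unix: \n, Windows: \r\n, classic Mac: \r), use the alternation \r\n|\r|\n, or \r?\n if classic Mac endings are not a concern. The alternation acts as a compatibility sledgehammer across platforms.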
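A minimal sketch of handling mixed line endings, assuming a made-up string that contains all three styles:

```javascript
// Hypothetical input mixing Unix, Windows, and classic Mac line endings.
const mixed = "unix\nwindows\r\nclassic mac\rlast";

// \r\n is listed first so a Windows ending counts as ONE break, not two.
const lines = mixed.split(/\r\n|\r|\n/);

// Alternatively, normalize everything to \n before further matching.
const normalized = mixed.replace(/\r\n|\r/g, "\n");

console.log(lines); // → [ 'unix', 'windows', 'classic mac', 'last' ]
```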

Evolution of multiline regex matching

The s (dotAll) flag, introduced in ECMAScript 2018, lets . match line breaks as well, making multiline matching as simple as pie.
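If older engines are a concern, one possible approach is to feature-detect the flag and fall back to [\s\S]. The supportsDotAll helper below is a hypothetical name, not a built-in.

```javascript
// Hypothetical helper: returns true when the engine accepts the s flag.
function supportsDotAll() {
  try {
    // The RegExp constructor throws a SyntaxError on unknown flags.
    return new RegExp(".", "s").dotAll === true;
  } catch (err) {
    return false;
  }
}

// Pick the flag-based pattern when available, else the [\s\S] fallback.
const preBlock = supportsDotAll()
  ? new RegExp("<pre>.*?</pre>", "s")
  : /<pre>[\s\S]*?<\/pre>/;

console.log(preBlock.test("<pre>a\nb</pre>")); // → true
```

Building the pattern with the RegExp constructor keeps the s flag out of a regex literal, which an older parser would reject before the check ever runs.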

Parsing HTML with regex

It's important to remember that parsing or manipulating HTML with regex is generally discouraged. Use DOM parsing methods for handling HTML's complex nested structures.

Wrapping up multiline matching

If you need to capture content in <pre> tags across multiple lines, a combination of a non-greedy quantifier, the s flag, and the g (global) flag does the trick:

const preContentRegex = /<pre>.*?<\/pre>/gs; // Matches each <pre>...</pre> block, tags included
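To pull out only the inner text of each block, one option is to add a capture group and iterate with String.prototype.matchAll (ES2020). The sample string below is an assumption for illustration.

```javascript
// Hypothetical input with two <pre> blocks, one spanning two lines.
const page = "<pre>one\ntwo</pre><p>skip</p><pre>three</pre>";

// The capture group (.*?) holds the content between the tags.
const preBlocks = /<pre>(.*?)<\/pre>/gs;

// matchAll requires the g flag and yields one match object per block.
const contents = Array.from(page.matchAll(preBlocks), (m) => m[1]);

console.log(contents); // → [ 'one\ntwo', 'three' ]
```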

Backward compatibility and performance considerations

The [^] idiom (an empty negated class that matches any character) is specific to JavaScript and does not port to other regex flavors, but it can still be useful where the s flag is unavailable. The alternation (.|\n) should be used sparingly, given its performance implications.

Alternatives to regex in JavaScript

Remember, regex is not the only solution when dealing with string patterns. JavaScript provides other handy tools for pattern matching, like .split(), .indexOf(), .includes(), as well as array methods like .filter(), .map(), .reduce().
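As a rough sketch of the non-regex route, the hypothetical between helper below extracts the text between two fixed markers with .indexOf() and .slice(), and a .split()/.filter() pair picks out lines by substring.

```javascript
// Hypothetical helper: text between the first `open` and the next `close`.
function between(text, open, close) {
  const start = text.indexOf(open);
  if (start === -1) return null;
  const from = start + open.length;
  const end = text.indexOf(close, from);
  return end === -1 ? null : text.slice(from, end);
}

const snippet = "intro\n<pre>a\nb</pre>\noutro";

// Newlines need no special handling: indexOf treats them like any character.
const inner = between(snippet, "<pre>", "</pre>");

// Line-oriented filtering without a regex.
const openingLines = snippet.split("\n").filter((line) => line.includes("<pre>"));

console.log(JSON.stringify(inner)); // → "a\nb"
console.log(openingLines);          // → [ '<pre>a' ]
```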