How to count string occurrences in a string?

javascript

regex

performance

best-practices

by Nikita Barsukov · Dec 24, 2024

const countOccurrences = (text, search) => (text.match(new RegExp(search, 'g')) || []).length;
console.log(countOccurrences('apple banana apple', 'apple'));  // Output: 2

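countOccurrences counts how many times search appears in text. new RegExp(search, 'g') builds a global pattern, so .match() returns an array of every non-overlapping match, and the array's length is the count. When nothing matches, .match() returns null, and the || [] fallback turns that into 0. Note that search is interpreted as a regex pattern, so characters such as . or ( must be escaped if you mean them literally.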

Breaking down the match method and regex flags

The JavaScript .match() method, combined with a regular expression, is a powerful tool for finding repeated patterns in text. The search is case-sensitive by default, which keeps the count accurate when case matters in your data; add the 'i' flag (for example 'gi') for a case-insensitive count.

Keep in mind that when .match() finds nothing, it returns null rather than an empty array. Reading .length on null throws a TypeError, so either fall back with the || operator as above or check the result explicitly and return 0:

const caseSensitiveCount = (text, search) => {
  // It's a match! Or it's a null... 
  const matches = text.match(new RegExp(search, 'g'));
  return matches ? matches.length : 0;
};
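
Because the search text is handed to new RegExp() as a pattern, metacharacters such as . or ( change what gets counted, or throw outright. The sketch below shows one way to match the search text literally and to opt into case-insensitive counting. The names escapeRegExp and countLiteral are hypothetical helpers introduced for illustration, and returning 0 for an empty search is an assumption, not something the original function does.

```javascript
// Hypothetical helper: escape every regex metacharacter so the search text is matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical variant of the counter: literal matching, optional case-insensitivity.
const countLiteral = (text, search, ignoreCase = false) => {
  if (search === '') return 0; // assumption: an empty search counts as no occurrences
  const pattern = new RegExp(escapeRegExp(search), ignoreCase ? 'gi' : 'g');
  const matches = text.match(pattern);
  return matches ? matches.length : 0;
};

console.log(countLiteral('a.b axb a.b', 'a.b'));               // 2 ("axb" is not counted)
console.log(countLiteral('Apple apple APPLE', 'apple', true)); // 3
console.log(countLiteral('price (USD)', '('));                 // 1 (unescaped, "(" would throw a SyntaxError)
```

If you really do want regex semantics in the search argument, skip the escaping step and pass the pattern through unchanged.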

Alternative: counting using split

If you'd rather steer clear of .match(), consider the .split() method. It breaks the original string into an array at each occurrence of the substring, so n occurrences produce n + 1 pieces. Subtract one from the length of this array, and you get the occurrence count.

const countWithSplit = (text, search) => {
  // Divide and conquer! Unless the divisor is an empty string, then we've got problems...
  if (!search.length) return text.length ? text.length - 1 : 0;
  return text.split(search).length - 1;
};
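
To make the "length minus one" rule concrete, here is a small check that prints the pieces .split() produces for a non-empty search string. It is only an illustration; the variable name pieces is arbitrary.

```javascript
// Inspect the pieces .split() produces to see why "length - 1" is the count.
const pieces = 'apple banana apple'.split('apple');
console.log(pieces);            // [ '', ' banana ', '' ]
console.log(pieces.length - 1); // 2

// No occurrence: the whole string comes back as a single piece.
console.log('apple banana'.split('kiwi').length - 1); // 0

// .split() never reuses characters, so overlapping occurrences are not counted.
console.log('aaaa'.split('aa').length - 1); // 2
```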

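Note that an empty search string is a special case, which is why the function guards against it. text.split('') does not return the whole string; it splits between every UTF-16 code unit, so 'abc'.split('') yields ['a', 'b', 'c'] and ''.split('') yields an empty array. Without the guard, the length-minus-one formula would return -1 for empty text. Whether an empty search should count as text.length - 1 or simply 0 is a design choice; return 0 instead if you prefer to treat an empty search as having no occurrences.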

Handling overlapping substrings

A global regex search misses overlapping substrings: after each match, scanning resumes at the end of that match, so an occurrence that starts inside it is never seen. Searching 'aaa' for 'aa' reports 1, not 2. To count overlaps, we roll up our sleeves and iterate over the text manually.

const countOverlapping = (text, search) => {
  // They see me loopin', they waitin'...
  // indexOf('') always succeeds, so an empty search would loop forever: bail out early
  if (search.length === 0) return 0;
  let count = 0;
  let position = text.indexOf(search);
  while (position !== -1) {
    count++;
    // Step forward by one character, not by search.length, so overlapping matches are found
    position = text.indexOf(search, position + 1);
  }
  return count;
};

Advancing by a single character after each hit is what lets overlapping matches be counted: 'aaa' contains 'aa' twice. The early return for a zero-length search string matters, because indexOf('') never returns -1 and the loop would otherwise never end.
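
As a cross-check for overlap counting, a regex can also see overlapping occurrences if the pattern is wrapped in a zero-width lookahead. This is an alternative sketch, not the article's method: countOverlappingLookahead and escapeRegExp are hypothetical names, and returning 0 for an empty search is an assumption.

```javascript
// Hypothetical helper: escape regex metacharacters so the search text is matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Count occurrences, overlaps included, using a zero-width lookahead.
const countOverlappingLookahead = (text, search) => {
  if (search === '') return 0; // assumption: an empty search counts as no occurrences
  // (?=...) matches at a position without consuming characters, so the
  // global scan moves forward one position at a time.
  const matches = text.match(new RegExp(`(?=${escapeRegExp(search)})`, 'g'));
  return matches ? matches.length : 0;
};

console.log(countOverlappingLookahead('aaaa', 'aa'));     // 3
console.log(countOverlappingLookahead('abababa', 'aba')); // 3
console.log(countOverlappingLookahead('abc', 'x'));       // 0
```

The lookahead version is compact, but it builds an array with one empty string per match, so the manual loop is usually the lighter option on large inputs.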

Keep an eye on performance

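Keeping the search string's length in a local variable can help performance, especially when dealing with monstrous amounts of text, because the loop body does less work on each pass. Bear in mind that .length is a stored property rather than something recomputed on every access, so the gain is often small; measure before relying on it.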

Performance tests indicate that some methods outpace regex-based matching:

Looping with .indexOf() is a strong choice for plain (non-regex) substrings.
Caching the lengths of text and search can help in tight loops, though modern engines often make the difference small.
Use .split() sparingly: it allocates an array of every piece, which can hurt performance on large strings.
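
Since timings depend heavily on the engine and the input, it is worth measuring on your own data. The harness below is a minimal sketch under stated assumptions: timeIt, viaIndexOf, viaSplit, and the synthetic bigText input are all hypothetical, and the numbers it prints are only indicative.

```javascript
// Fall back to Date.now() where the performance global is unavailable.
const now = () => (typeof performance !== 'undefined' ? performance.now() : Date.now());

// Hypothetical counters to compare; both count non-overlapping literal occurrences.
const viaIndexOf = (text, search) => {
  if (search === '') return 0;
  let count = 0;
  let pos = text.indexOf(search);
  while (pos !== -1) {
    count++;
    pos = text.indexOf(search, pos + search.length);
  }
  return count;
};
const viaSplit = (text, search) => (search === '' ? 0 : text.split(search).length - 1);

// Hypothetical harness: run fn repeatedly and report the elapsed milliseconds.
const timeIt = (fn, runs = 20) => {
  let result;
  const start = now();
  for (let i = 0; i < runs; i++) result = fn();
  return { result, ms: now() - start };
};

const bigText = 'apple banana '.repeat(100000); // synthetic input, 1.3 million characters
const a = timeIt(() => viaIndexOf(bigText, 'apple'));
const b = timeIt(() => viaSplit(bigText, 'apple'));
console.log(`indexOf: ${a.result} matches in ${a.ms.toFixed(1)} ms`);
console.log(`split:   ${b.result} matches in ${b.ms.toFixed(1)} ms`);
```

Run it several times and with realistic text before drawing conclusions; a single run on synthetic input says little about production behavior.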