
Case insensitive XPath contains() possible?

xpath
xpath-functions
javascript
dom-manipulation
by Anton Shumikhin · Dec 29, 2024
TLDR

For case-insensitive searches in XPath 1.0, use the translate() function to lowercase the text before passing it to contains():

//*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'term')]

Replace 'term' with your query in lowercase. Because the element text is lowercased before the comparison, the contains() lookup becomes case-insensitive.

Deep-dive into the translate() function

The translate() function is XPath 1.0's tool for character-by-character substitution: translate(string, from, to) replaces every character of string that appears in from with the character at the same position in to. Mapping the uppercase alphabet onto the lowercase one normalizes away case variations.

Roughly translating for non-programmers: I see no cases, only characters.

translate(., 'ABCDE...Z', 'abcde...z')

This brings us one step closer to a case-less world of XPath.
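To make the substitution rule concrete, here is a rough JavaScript emulation of what translate() does. The name xpathTranslate is hypothetical and exists only for illustration; it is a sketch of the XPath 1.0 semantics, not a replacement for a real XPath engine.

```javascript
// Hypothetical helper: mimics XPath 1.0 translate(str, from, to) in plain JavaScript.
function xpathTranslate(str, from, to) {
  const fromChars = Array.from(from);
  const toChars = Array.from(to);
  let out = "";
  for (const ch of str) {
    const i = fromChars.indexOf(ch); // first occurrence in `from` wins
    if (i === -1) {
      out += ch; // character not listed in `from`: kept unchanged
    } else if (i < toChars.length) {
      out += toChars[i]; // listed: replaced by the character at the same position
    }
    // listed, but `to` is too short to have a counterpart: character is dropped
  }
  return out;
}

const UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const LOWER = "abcdefghijklmnopqrstuvwxyz";

console.log(xpathTranslate("Hello World", UPPER, LOWER)); // hello world
console.log(xpathTranslate("ÉCOLE", UPPER, LOWER)); // École
```

The second call shows that only the characters listed in the alphabets are folded: É is not in the A-Z list, so it passes through untouched.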

Possible limitations and getting around them

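Although translate() is quite versatile, it has two weak spots. First, it only folds the characters you list, so accented and other non-ASCII letters (É, Ü, and so on) stay case-sensitive unless you add them to both alphabets. Second, search terms that contain quotes need care: XPath 1.0 string literals have no escape sequences, so a term with a single quote must be wrapped in double quotes, and a term containing both kinds of quote has to be assembled with concat(), which leads to more complex expressions.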
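When the search term comes from user input, the quoting can be handled by a small function. The sketch below assumes an XPath 1.0 target; toXPathLiteral is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: wraps arbitrary text as an XPath 1.0 string literal.
function toXPathLiteral(text) {
  // No single quote inside: single-quoted literal is safe
  if (!text.includes("'")) return "'" + text + "'";
  // No double quote inside: double-quoted literal is safe
  if (!text.includes('"')) return '"' + text + '"';
  // Both kinds present: split on single quotes and glue the pieces
  // back together with concat(), passing each single quote as "'"
  const parts = text.split("'").map((part) => "'" + part + "'");
  return "concat(" + parts.join(", \"'\", ") + ")";
}

console.log(toXPathLiteral("term")); // 'term'
console.log(toXPathLiteral("o'neil")); // "o'neil"
console.log(toXPathLiteral("say \"hi\" it's")); // concat('say "hi" it', "'", 's')
```

The returned string can be dropped into the contains() call in place of a hand-written 'term' literal.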

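XPath 2.0 introduces functions that make our life easier. lower-case() replaces the translate() workaround, as in //*[contains(lower-case(.), 'term')], and matches() supports regular expressions with a case-insensitive 'i' flag, as in //*[matches(., 'term', 'i')]. Keep in mind that document.evaluate() in browsers implements XPath 1.0 only, so these functions require an XPath 2.0 (or later) processor.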

Leveraging JavaScript to build dynamic XPath expressions

To build dynamic XPath expressions, especially for DOM manipulation via JavaScript, having functions to construct such expressions can be a lifesaver. This, of course, assumes a solid understanding of both XPath and JavaScript.

// Time to bring JavaScript into play
const dynamicXPath = (term) => `//*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), '${term.toLowerCase()}')]`;
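A slightly fuller sketch of the same idea, paired with the standard document.evaluate() call, might look like the following. buildContainsXPath and findByText are hypothetical names chosen for this example, and the builder assumes the term contains no single quote and only ASCII letters.

```javascript
const UPPERCASE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const LOWERCASE = "abcdefghijklmnopqrstuvwxyz";

// Hypothetical helper: builds the case-insensitive expression for a search term.
// Assumption: `term` has no single quote and only ASCII letters need folding.
function buildContainsXPath(term) {
  const needle = term.toLowerCase(); // the needle must be lowercase to match the folded text
  return `//*[contains(translate(., '${UPPERCASE}', '${LOWERCASE}'), '${needle}')]`;
}

// Hypothetical helper: evaluates the expression in a browser and returns
// the matching elements as an array. Needs a DOM, so it is browser-only.
function findByText(term) {
  const snapshot = document.evaluate(
    buildContainsXPath(term),
    document,
    null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );
  const nodes = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    nodes.push(snapshot.snapshotItem(i));
  }
  return nodes;
}

console.log(buildContainsXPath("TeRm"));
```

In a browser console, findByText("TeRm") would then return every element whose text contains "term" in any casing.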

Practical examples and caveats

Here's how you can map a full alphabet for a case-insensitive search:

//*[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'example')]

Replace 'example' with your search term in lowercase, and every occurrence is matched irrespective of the case used in the document. Beware: DINOSAURS are not allowed, whether upper case or lower case.
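One caveat with this expression: the string value of an element includes the text of all its descendants, so //* also matches every ancestor of the element that actually holds the text, up to the document's root element. If only the innermost elements are wanted, a common variation tests the element's own text nodes instead. The sketch below builds that variant; buildOwnTextXPath is a hypothetical name, and the term is again assumed to be free of single quotes.

```javascript
const UPPER_AZ = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const LOWER_AZ = "abcdefghijklmnopqrstuvwxyz";

// Hypothetical helper: matches only elements that have a direct text-node
// child containing the term, instead of every ancestor of such an element.
// Assumption: `term` contains no single quote.
function buildOwnTextXPath(term) {
  const needle = term.toLowerCase();
  return `//*[text()[contains(translate(., '${UPPER_AZ}', '${LOWER_AZ}'), '${needle}')]]`;
}

console.log(buildOwnTextXPath("Example"));
// //*[text()[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'example')]]
```

The trade-off is that a term split across child elements (for instance by inline markup in the middle of a word) is no longer found, since each text node is tested on its own.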