Explain Codes LogoExplain Codes Logo

How can I make SQL case sensitive string comparison on MySQL?

sql
case-sensitive-comparison
mysql-collations
sql-string-comparison
Anton ShumikhinbyAnton Shumikhin·Aug 22, 2024
TLDR

Harness the power of the BINARY operator for case sensitive comparisons in MySQL. This finely treats strings as binary data, distinguishing uppercase and lowercase:

SELECT * FROM table_name WHERE BINARY column_name = 'ExactCase';

String matches are strictly exact, meaning 'ExactCase' is different from 'exactcase'.

Explore case sensitivity in MySQL

The role of collations in case sensitive comparisons

The default collation in MySQL (latin1_swedish_ci) is not case-sensitive. But you can use the COLLATE operator with a binary collation such as latin1_bin or utf8mb4_bin for on-the-fly case sensitive comparisons:

SELECT * FROM table_name WHERE column_name COLLATE utf8mb4_bin = 'ExactCase';

Preserving index functionality with collate

Wrapping a column within CONVERT and COLLATE for a case comparison can compromise index usage. To preserve index functionality, apply these functions to the value, not the column. And when in doubt, deploy the EXPLAIN to examine how an index is being utilized in a given query:

EXPLAIN SELECT * FROM table_name WHERE column_name COLLATE utf8mb4_bin = 'ExactCase';

Table-wide case sensitivity: A word of caution

You can modify the table's collation to a binary collation to enforce inherent case sensitivity. But remember, this will apply to all queries on the string columns:

ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

Tricky aspects with binary

Don't hastily use LIKE BINARY or directly apply the BINARY operator in WHERE clauses on non-binary columns. It can force a full table scan or lead to charset mismatches, opening a Pandora's box of issues.

Case sensitivity hacks in MySQL

Prime-ordering your data

Use ORDER BY alongside your case sensitive query to sort and prioritize your valuable data:

SELECT * FROM table_name WHERE BINARY column_name = 'ExactCase' ORDER BY valuable_column DESC; -- pizza is on me if your column is actually named valuable_column 🍕

Partial matching but make it case-sensitive

When you want to partially match text but with respect to case, use COLLATE and LIKE:

SELECT * FROM table_name WHERE column_name LIKE 'Exa%' COLLATE utf8mb4_bin;

SQL String Comparison: The Bouncers at 'The Data Club'

Case Insensitive (CI) Bouncer 🚶‍♂️: "Irrespective of 'JOHN', 'John', or 'joHn', you can join the party!"

Case Sensitive (CS) Bouncer 💂‍♂️: "'JOHN' you say? 'John' or 'joHn' need not apply!"

How strictly they check the ID is like the application of a BINARY comparison:

-- CS Bouncer is that guy who will object to the tiniest typo on your ID SELECT * FROM `users` WHERE BINARY `username` = 'JOHN';

Compared to the lenient default comparison:

-- Got the letters somewhat right? Come on in! SELECT * FROM `users` WHERE `username` = 'JOHN';

Strategizing your MySQL case sensitivity approach

Match your collations and charsets

Opt for utf8mb4 over utf8mb3 for complete UTF-8 support. When choosing a collation for case sensitivity, ensure it matches the charset for accurate comparisons.

ALTER TABLE table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;

When _ci is not what you want

Be aware that collations with _ci are both case and accent insensitive. You might want to steer clear of those if you're looking for strict case distinction.

Case of the mischievous charset mismatch

Beware of a potential charset mismatch when using binary. Always pair the right charsets and collations. Remember, 'ä' != 'a'.

-- Just when you thought all 'a' were created equal! SELECT 'ä' LIKE BINARY 'a' AS Unexpected;