How can I make SQL case sensitive string comparison on MySQL?
Harness the power of the BINARY
operator for case sensitive comparisons in MySQL. This finely treats strings as binary data, distinguishing uppercase and lowercase:
String matches are strictly exact, meaning 'ExactCase' is different from 'exactcase'.
Explore case sensitivity in MySQL
The role of collations in case sensitive comparisons
The default collation in MySQL (latin1_swedish_ci
) is not case-sensitive. But you can use the COLLATE
operator with a binary collation such as latin1_bin
or utf8mb4_bin
for on-the-fly case sensitive comparisons:
Preserving index functionality with collate
Wrapping a column within CONVERT
and COLLATE
for a case comparison can compromise index usage. To preserve index functionality, apply these functions to the value, not the column. And when in doubt, deploy the EXPLAIN
to examine how an index is being utilized in a given query:
Table-wide case sensitivity: A word of caution
You can modify the table's collation to a binary collation to enforce inherent case sensitivity. But remember, this will apply to all queries on the string columns:
Tricky aspects with binary
Don't hastily use LIKE BINARY
or directly apply the BINARY
operator in WHERE
clauses on non-binary columns. It can force a full table scan or lead to charset mismatches, opening a Pandora's box of issues.
Case sensitivity hacks in MySQL
Prime-ordering your data
Use ORDER BY
alongside your case sensitive query to sort and prioritize your valuable data:
Partial matching but make it case-sensitive
When you want to partially match text but with respect to case, use COLLATE
and LIKE
:
SQL String Comparison: The Bouncers at 'The Data Club'
Case Insensitive (CI) Bouncer 🚶♂️: "Irrespective of 'JOHN', 'John', or 'joHn', you can join the party!"
Case Sensitive (CS) Bouncer 💂♂️: "'JOHN' you say? 'John' or 'joHn' need not apply!"
How strictly they check the ID is like the application of a BINARY comparison:
Compared to the lenient default comparison:
Strategizing your MySQL case sensitivity approach
Match your collations and charsets
Opt for utf8mb4
over utf8mb3
for complete UTF-8 support. When choosing a collation for case sensitivity, ensure it matches the charset for accurate comparisons.
When _ci
is not what you want
Be aware that collations with _ci
are both case and accent insensitive. You might want to steer clear of those if you're looking for strict case distinction.
Case of the mischievous charset mismatch
Beware of a potential charset mismatch when using binary
. Always pair the right charsets and collations. Remember, 'ä' != 'a'.
Was this article helpful?