Explain Codes LogoExplain Codes Logo

Count the occurrences of DISTINCT values

sql
distinct
count
group-by
Anton ShumikhinbyAnton Shumikhin·Sep 12, 2024
TLDR

Need to count unique entries in your SQL data? Just pile on COUNT() and GROUP BY in one concise command:

SELECT column, COUNT(*) AS count FROM table GROUP BY column;

Just like that, each distinct column value in table is paired with its corresponding count. Ready for more? Let's dive in!

DISTINCT: The spook in your SQL closet

SQL can sometimes be a circus with recurring acts, but DISTINCT steps in to chase away the redundant clowns. Use it within COUNT(*) to count only unique occurrences, cutting down on unwanted repetition:

SELECT column, COUNT(DISTINCT column) AS count FROM table GROUP BY column ORDER BY count DESC;

Simply put, with DISTINCT, your data carnival gets less chaotic, presenting unique acts and their popularity in descending order.

The Troy Horse: COUNT(1)

Though not an all-time favorite, COUNT(1) rides in as a dark horse, offering performance perks when dealing with large data. Should you bet on this unlikely candidate? Check it out:

SELECT column, COUNT(1) AS count FROM table GROUP BY column;

// Like pineapple on pizza, it's an acquired taste. But you never know until you try!

Still not convinced? How about this: just as a Trojan horse fooled Troy, the COUNT(1) keeps your database optimizer guessing, often resulting in quicker results.

Clarity in the face of redundancy

When it comes to expressing data patterns clearly, GROUP BY and DISTINCT dress up like superheroes, eliminating duplicates to represent each unique name correctly:

SELECT column, COUNT(DISTINCT other_column) AS count FROM table GROUP BY column;

// It's like they left their cloak of invisibility at home. No more hiding, all distinct entries in spotlight!

Practical scenarios, problems, and pepperonis

NULL values: How to count ghosts?

COUNT() treats NULL values like they don't exist! To include these ghostly occurrences, use the conjuring power of CASE within COUNT():

SELECT column, COUNT(CASE WHEN other_column IS NOT NULL THEN 1 END) AS count FROM table GROUP BY column;

// Because not all heroes wear capes. Some NULLs deserve recognition too!

Tricky data types: Beware of shady characters

VARCHAR or other non-standard data types can be as tricky to handle as wet soap. Look out for whitespaces or case sensitivity that might disrupt the uniformity of your data.

Join operations: Tying the knot without tripping

Doing joins? Ensure your DISTINCT is well-placed; otherwise, you might end up with misleading results:

SELECT a.column, COUNT(DISTINCT b.other_column) AS count FROM a JOIN b ON a.id = b.foreign_id GROUP BY a.column;

// Beware! Any misplaced step while doing the SQL tango with joins can lead to inflated counts. Tango is a two, not a three!

In other words, handle joins carefully. They can inflate row counts, but a correctly placed DISTINCT inside a COUNT() ensures a graceful pasodoble with your data.