Explain Codes LogoExplain Codes Logo

Selecting COUNT(*) with DISTINCT

sql
distinct-counts
subqueries
null-values
Nikita BarsukovbyNikita Barsukov·Aug 16, 2024
TLDR

Eager to get count of unique entries? Use this:

SELECT COUNT(DISTINCT column_name) FROM table_name;

This gives you a count of unique column_name values in table_name, shoving duplicates aside.

Distinct Counts Avoiding Asterisks

Never use COUNT(DISTINCT *). Use DISTINCT with a specific column for meaningful results. When dealing with distinct program types, here's the go-to statement:

-- Goodbye duplicates, hello accuracy! SELECT program_type, COUNT(DISTINCT program_name) AS Count FROM cm_production WHERE push_number = @push_number GROUP BY program_type;

It filters by push_number, groups by program_type, and counts distinct program_name.

Clear Results with Aliases and Filters

For output clarity, use aliases like AS Count for distinct counts, AS [Type] for field clarity. Don't overcomplicate act of filtering data using a variable like @push_number. It's as easy as pie.

-- Variables declutter your queries, try it, they're not contagious! DECLARE @push_number INT = 1234; SELECT program_type AS [Type], COUNT(DISTINCT program_name) AS Count FROM cm_production WHERE push_number = @push_number GROUP BY program_type ORDER BY Count DESC, [Type];

Data is ordered by count then program type - prioritizing the meat and potatoes of the info.

Subqueries for Data Precision

When queries get complex, like counting distinct values in outer queries - subqueries or derived tables come to rescue:

-- Subqueries, because life's too short for manual counting! SELECT COUNT(*) FROM ( SELECT DISTINCT program_name FROM cm_production WHERE push_number = @push_number ) AS unique_programs;

Counts distinct program_name for a push_number like a boss - neatly packed in a subquery.

Squashing NULL Values

Use WHERE column_name IS NOT NULL to keep NULL values in check ensuring accurate counts. Just like in this piece of art:

SELECT COUNT(DISTINCT column_name) FROM table_name WHERE column_name IS NOT NULL;

This ensures that you're not counting phantom data. Ghostbusters would approve!

Adapting SQL Scripts

And when you'd want to count distinct regions and program types - a flexible SQL script is the way to go:

-- Concatenation, the secret sauce to mixed column distinct counts! SELECT COUNT(DISTINCT CONCAT(region, program_type)) AS UniqueRegionTypes FROM cm_production WHERE push_number = @push_number;

Here's to CONCAT for creating a unique identifier, and to your problem - goodbye!

Compatibility with Older SQL Versions

Working with SQL Server 2005? Some constructs are not supported. Replace CONCAT with the + operator for backward compatibility, and ISNULL for possible NULL values.