In SQL, what's the difference between count(column) and count(*)?
Within the scope of SQL, COUNT(column)
notches up non-null entries of a specified column, whereas COUNT(*)
tallies all rows, with negligent regard to duplicates and nulls.
COUNT(*)
welcomes every record to the counting party, including duplicates and aimless null-containing rows, while COUNT(column)
tactfully leaves out null values.
Precision vs Generality: Choose Wisely
Chasing precision with count(column)
COUNT(column)
curates exclusive counts of non-null column values, avoiding the extra sprinkling of NULL
values. It comes in handy when:
- You are playing detective, identifying non-NULL duplicates:
SELECT column_name FROM table WHERE COUNT(column_name) > 1
- Counting unique non-NULL values only:
SELECT COUNT(DISTINCT column_name) FROM table
Pursuing speed with count(*)
COUNT(*)
leaps over individual row data, making use of table indexes for an impressive speed boost. It's the go-to when:
- Determining the dataset size
- Conducting data integrity checks (Is everything still intact after that data import?)
Wrestling with NULL
If you have a users
table where the email
column may contain NULL
values:
COUNT(email)
delivers the count of users who provided an email
, while COUNT(*)
sweeps up all users. No email? No problem for COUNT(*)
.
Counting without hiccups
Swapping COUNT(column_name)
with COUNT(*)
in a query designed to filter duplicates or specific criteria can play tricks on your query logic, nudging you towards misleading results.
Nifty Tricks & Treats in Count-Land
Vanishing act with empty datasets
Performing a count on an empty set? Watch the null vanish:
Both return 0
, as the WHERE
clause cleverly filters out all partygoers.
Counting amidst Change
Transactional databases could be writing (INSERT) or erasing (DELETE) records as you count. Here, COUNT(*)
snaps a headcount at that moment, whereas COUNT(column)
may miss the latecomers and early leavers.
Index-induced Speed
Lacking a proper index, COUNT(column_name)
may get a tad sluggish with full table scans, while COUNT(*)
can zip through using optimized shortcuts through indexes.
When Subtleties are Paramount
Group Therapy with count
The GROUP BY
clause often teams with COUNT
, crunching counts for each group:
- With
COUNT(column)
, groups sporting all nulls find themselves outside the count. COUNT(*)
, on the other hand, accurately reflects the group sizes.
Paying heed to Constraints
When columns are shackled with NOT NULL constraints, COUNT(column)
and COUNT(*)
yield identical counts. From here, the choice hinges on your situation or coding style preferences.
Dealing with Joins
LEFT JOIN
s might insert NULL
values into rows, skewing counts if COUNT(column)
is beckoned from the nullable side:
Here, COUNT(order_items.id)
count only orders that have been fulfilled, while COUNT(*)
embraces all orders, completed or otherwise.
Was this article helpful?