In SQL, what's the difference between count(column) and count(*)?
In SQL, COUNT(column) counts the rows in which the specified column is not NULL, whereas COUNT(*) counts every row, NULLs and duplicates included.
COUNT(*) welcomes every record to the counting party, even rows that are NULL in every column, while COUNT(column) tactfully leaves out the rows where that column is NULL. Neither one removes duplicates.
Precision vs Generality: Choose Wisely
Chasing precision with count(column)
COUNT(column) counts only the non-NULL values of a column, leaving the NULL sprinkles out. It comes in handy when:
- You are playing detective, identifying non-NULL duplicates:
SELECT column_name FROM table_name GROUP BY column_name HAVING COUNT(column_name) > 1
- You need to count unique non-NULL values only:
SELECT COUNT(DISTINCT column_name) FROM table_name
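The two queries above can be tried end to end with a small sketch. It assumes an in-memory SQLite database driven from Python's standard sqlite3 module; the contacts table and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (email TEXT)")
conn.executemany(
    "INSERT INTO contacts (email) VALUES (?)",
    [("a@x.com",), ("a@x.com",), ("b@x.com",), (None,), (None,)],
)

# Non-NULL values that occur more than once. The aggregate filter
# belongs in HAVING, because WHERE runs before rows are grouped.
duplicates = conn.execute(
    """
    SELECT email, COUNT(email) AS occurrences
    FROM contacts
    GROUP BY email
    HAVING COUNT(email) > 1
    """
).fetchall()

# Number of distinct non-NULL values.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT email) FROM contacts"
).fetchone()[0]

print(duplicates)      # [('a@x.com', 2)]
print(distinct_count)  # 2
```

The two NULL rows do form a group of their own, but COUNT(email) is 0 for that group, so HAVING drops it.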
Pursuing speed with count(*)
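COUNT(*) never has to inspect individual column values, so many database engines can answer it from the smallest available index instead of reading full rows. How much of a speed boost that gives depends on the engine and the table. It's the go-to when: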
- Determining the dataset size
- Conducting data integrity checks (Is everything still intact after that data import?)
Wrestling with NULL
Picture a users table where the email column may contain NULL values.
COUNT(email) delivers the count of users who provided an email, while COUNT(*) sweeps up all users. No email? No problem for COUNT(*).
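A minimal sketch of that scenario, assuming SQLite through Python's standard sqlite3 module. The column layout and the four sample rows are made up for the demonstration.

```python
import sqlite3

# Hypothetical users table: two users gave an email, two did not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("ann@example.com",), (None,), ("bob@example.com",), (None,)],
)

# COUNT(email) skips the NULL emails; COUNT(*) counts every row.
with_email, all_users = conn.execute(
    "SELECT COUNT(email), COUNT(*) FROM users"
).fetchone()

print(with_email, all_users)  # 2 4
```

The gap between the two numbers is the count of users with no email on file.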
Counting without hiccups
Swapping COUNT(column_name) for COUNT(*) in a query designed to filter duplicates or specific criteria can play tricks on your query logic: rows where the column is NULL suddenly join the count, nudging you towards misleading results.
Nifty Tricks & Treats in Count-Land
Vanishing act with empty datasets
Performing a count on an empty set? Watch the null vanish: when the WHERE clause filters out all partygoers, both COUNT(*) and COUNT(column) return 0 rather than NULL.
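Here is a small sketch of the empty-set case, again assuming SQLite via Python's sqlite3 module and a hypothetical users table. The SUM query is included only as a contrast, since SUM over zero rows yields NULL while the counts do not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('ann@example.com')")

# WHERE 1 = 0 matches no rows, so both counts run over an empty set.
empty_counts = conn.execute(
    "SELECT COUNT(*), COUNT(email) FROM users WHERE 1 = 0"
).fetchone()

# Contrast: SUM over the same empty set is NULL (None in Python).
empty_sum = conn.execute(
    "SELECT SUM(id) FROM users WHERE 1 = 0"
).fetchone()[0]

print(empty_counts)  # (0, 0)
print(empty_sum)     # None
```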
Counting amidst Change
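Transactional databases could be writing (INSERT) or erasing (DELETE) records as you count. Both COUNT(*) and COUNT(column) snap their headcount from the same set of rows the query is allowed to see under its transaction isolation level, so either one may miss latecomers and early leavers whose changes land after that point. Neither form is more up to date than the other.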
Index-induced Speed
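COUNT(column_name) on a nullable column has to check each value for NULL, so lacking an index on that column it may get a tad sluggish with a full table scan. COUNT(*) inspects no column at all, which lets many engines zip through whichever index is smallest. The exact behaviour depends on the database engine and its query planner.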
When Subtleties are Paramount
Group Therapy with count
The GROUP BY clause often teams with COUNT, crunching counts for each group. With COUNT(column), a group sporting all NULLs still shows up in the result but reports a count of 0. COUNT(*), on the other hand, accurately reflects the group sizes.
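The grouping behaviour can be sketched as follows, assuming SQLite through Python's sqlite3 module. The department column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (department TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (department, email) VALUES (?, ?)",
    [
        ("sales", "ann@example.com"),
        ("sales", None),
        ("support", None),  # every email in this group is NULL
        ("support", None),
    ],
)

# One row per department: non-NULL emails next to the group size.
per_group = conn.execute(
    """
    SELECT department, COUNT(email), COUNT(*)
    FROM users
    GROUP BY department
    ORDER BY department
    """
).fetchall()

print(per_group)  # [('sales', 1, 2), ('support', 0, 2)]
```

The support group is still listed; only its COUNT(email) drops to 0.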
Paying heed to Constraints
When columns are shackled with NOT NULL constraints, COUNT(column) and COUNT(*) yield identical counts. From here, the choice hinges on your situation or coding style preferences.
Dealing with Joins
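LEFT JOINs fill the right-hand table's columns with NULL wherever no match exists, skewing counts if COUNT(column) is beckoned from the nullable side.
Take orders LEFT JOINed to order_items: COUNT(order_items.id) counts only the rows where an order item actually matched, while COUNT(*) embraces every joined row, including the NULL-padded row of an order with no items.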
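A minimal sketch of the join case, assuming SQLite via Python's sqlite3 module. The orders and order_items schemas below are guesses at a plausible layout, not the original author's tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE order_items (id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders (id) VALUES (1), (2), (3);
    -- Order 3 deliberately has no items.
    INSERT INTO order_items (id, order_id) VALUES (10, 1), (11, 1), (12, 2);
    """
)

# Per order: matched items (nullable side) next to joined rows.
per_order = conn.execute(
    """
    SELECT orders.id, COUNT(order_items.id), COUNT(*)
    FROM orders
    LEFT JOIN order_items ON order_items.order_id = orders.id
    GROUP BY orders.id
    ORDER BY orders.id
    """
).fetchall()

print(per_order)  # [(1, 2, 2), (2, 1, 1), (3, 0, 1)]
```

Order 3 has no items, yet COUNT(*) reports 1 because the LEFT JOIN still emits one NULL-padded row for it.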