Explain Codes LogoExplain Codes Logo

Count(*) vs count(column-name) - which is more correct?

sql
performance
best-practices
query-optimization
Nikita BarsukovbyNikita Barsukov·Dec 31, 2024
TLDR

Choose COUNT(*) for a comprehensive count of all rows, irrespective of NULL values. Use COUNT(column-name) when you want a tally of non-NULL values in a specific column.

A pinch of SQL for you:

SELECT COUNT(*) FROM your_table; -- Total rows (All the potatoes in the sack) SELECT COUNT(your_column) FROM your_table; -- Non-NULL in 'your_column' (Just the good potatoes)

Decision Time: COUNT(*) for overall count, COUNT(column-name) for counting non-NULL in a column.

If performance racing is your thing, COUNT(*) usually wins, especially with no WHERE clause on MyISAM storage engine as it gets to avoid the mud (table data). Meanwhile, COUNT(column-name) preps itself with indexed columns, though the mileage varies based on the engine's horsepower (database engine) and its tires (indexing strategy).

Knowing your racecourse (Performance implications)

Taking the Index Lane

Indexes can turbocharge your COUNT operations. Here's how:

  • Non-indexed columns: COUNT(column-name) could take the scenic route (full scan) to check for NULL instances, affecting your speed time (performance).
  • Indexed columns: If the column has index turbos, COUNT(column-name) could just use these to blitz across the finish-line, potentially at the speed of COUNT(*).

MyISAM vs InnoDB

  • MyISAM: COUNT(*) is super-fast, as if it had a secret teleportation device (accesses stored row count directly).
  • InnoDB: This engine forgot its teleportation device at home, so COUNT(*) may rank up similar times to COUNT(column-name).

Handling Unique Turns

To count unique non-NULL values, trust your GPS saying COUNT(DISTINCT column-name), which you can supercharge with indexing.

Stuff I Wish I Knew Before (Special Considerations)

Accuracy vs Speed

Though COUNT(*) gets the fast lane, remember that accurate count (correctness) matters too. If your table has a non-NULL identifier column, either method would work without much counting academic discrepancies (see, math is everywhere!). Consider COUNT(column-name) as your best bud when NULL values do matter.

Context Matters

Adopt a situational approach (think "SQL detective") when choosing between COUNT(*) and COUNT(column-name). Comprehend your query's purpose, expected result, and the table formation to make a Sherlock-level decision, ensuring the mystery of both correctness and efficiency is solved.

Querying Wisely

A secret ingredient to your SQL stew is optimized queries. Look beyond the COUNT(*) vs COUNT(column-name) battle and contemplate the structure of your entire query for performance gains.