Explain Codes LogoExplain Codes Logo

Why is SELECT * considered harmful?

sql
best-practices
performance
database-design
Nikita BarsukovbyNikita BarsukovΒ·Sep 11, 2024
⚑TLDR

SELECT * is ill-advised as it leads to slower queries on wide tables and can cause failures if the schema changes. It's inefficient for networks and databases, especially when handling unused data. Therefore, always define the required columns for faster, extra-safe, and forward-compatible code.

Not recommended:

-- Hey, I've all the time in the world. Let's fetch everything! πŸ˜‰ SELECT * FROM users;

Recommended:

-- Stay lean, trim the fat, get only what's needed! 😎 SELECT id, username, email FROM users;

This method ensures minimum data transfer and preserves performance in case of any database structure changes.

Consequences of Being a SELECT * Fan

Frequent use of SELECT * may seem inconsequential, but its impacts can bring up the following issues:

  1. Performance Blues: Fetching all columns, whether necessary or not, can slow down query performance.
  2. Maintenance Nightmares: Dependencies on database structure go unnoticed, leading to future schema change shocks.
  3. Indexing Issues: SELECT * bypasses indexing strategies, forcing the database to break a sweat.
  4. Data Transmission Overload: What’s worse than unnecessary server workload and network latency? Nothing, precisely.
  5. Confusing Results: Binding issues pop up when columns have identical names across tables. Thanks, SELECT *.
  6. Future Surprises: Your code can hit the ground running... in the wrong direction if schema changes cause fails.

Stay Safe: Be Explicit About Your Columns

Good news, though, best practices come to the rescue:

  • Define Columns: Spell out only needed columns to enhance clarity, reduce data flow, and keep your application resilient against schema changes.
  • Stick to Aliases: Use table aliases and clear column names for more readable and maintainable code.
  • Modularity Matters: Develop your queries with an eye on modularity and strict typing for foolproof data integrity.
  • Deal with Updates: Defining columns equips your code to combat database changes, preventing unnecessary revision or potential bugs when new columns are added or existing ones removed.

Best Practices: Steer Clear of SELECT *

In complex databases, every decision counts:

  • Counting rows: An innocent SELECT COUNT(*) may crash into performance bottlenecks in specific databases.
  • Threading updates: New columns can inadvertently creep into your results, inviting unexpected code revisions.
  • Ad-hoc and debug: Save SELECT * for quick checks and debugging. Short-term performance trade-off? Acceptable.
  • Column order: Keep an eye out for arbitrary column appearances, which may break your program if it relies on column positions.