Using group by on two fields and count in SQL

sql

best-practices

group-by

count

byAlex Kataev·Nov 30, 2024

For a quick solution, GROUP BY with two columns and utilize COUNT() to note the occurrences:

SELECT Col1, Col2, COUNT(*) AS Total
FROM Table
GROUP BY Col1, Col2;

If your columns share names with SQL keywords or functions, employ aliases for unique identifiers:

SELECT myGroup AS GroupName, mySubGroup AS SubGroupName, COUNT(*) AS GroupCount
FROM myTable
GROUP BY myGroup, mySubGroup;

Test your SQL queries in their specific database environments for accurate counts. Be wary of case sensitivity, and avoid using reserved SQL keywords for column names to bypass naming conflicts.

GROUP BY and COUNT: Best practices and potential pitfalls

Keyword and case sensitivity issues

Be mindful of your column names in SQL —keywords or case-sensitive names can generate errors:

Aliases offer a workaround for keyword usage in column names.
Running a trial of your query in the intended database can help spot issues with case sensitivity.

Demystifying COUNT()

COUNT() can be a tricky function, but these pointers may help:

COUNT(*) counts all rows, including those with NULL values.
To count only non-NULL values in a column, use COUNT(column_name).
To maintain consistency, ensure the columns in your SELECT statement match those used in GROUP BY.

Mastering GROUP BY: Optimization and techniques

Unlocking the power of subqueries

Subqueries can bring considerable power to complex aggregations:

-- Subqueries: befriending aggregation one layer at a time!
SELECT friend_pair, MAX(times_met)
FROM (
    SELECT concat(friend1, ' & ', friend2) AS friend_pair, COUNT(*) AS times_met
    FROM photo_album
    GROUP BY friend1, friend2
) AS subquery
GROUP BY friend_pair;

Scalability with indexes

Creating indexes on GROUP BY columns can swift up performance:

Place the index order in line with the GROUP BY order.
Use composite indexes for multiple GROUP BY columns.

Advanced techniques: OLAP functions

Modern SQL databases support OLAP functions that allow for an additional layer of analysis:

-- OLAP functions: because SQL has its ups and downs!
SELECT 
  Col1, 
  Col2,
  COUNT(*) OVER(PARTITION BY Col1, Col2) AS RunningTotal
FROM 
  Table;