Explain Codes LogoExplain Codes Logo

Select multiple columns from a table, but group by one

sql
join
groupby
subqueries
Anton ShumikhinbyAnton Shumikhin·Oct 24, 2024
TLDR

When you strive to select multiple columns but group by one, the aggregate functions like SUM() or MAX() are your heroes. For instance, let's consider a sales table with salesperson and sales_amount:

SELECT salesperson, SUM(sales_amount) AS total_sales FROM sales GROUP BY salesperson;

The column salesperson becomes the factor driving the groupings, and their individual sales are lumped together, or aggregated. Each column in your SELECT statement that's not part of the group should be included in an aggregate function to ensure a valid SQL grouping.

When non-uniques cause headaches 🤯

Non-unique columns can throw your result set into chaos. Let's see how the MAX() function comes to the rescue:

SELECT ProductID, MAX(ProductName) /*Most popular kid in town*/ AS ProductName, SUM(OrderQuantity) AS TotalOrdered FROM Products GROUP BY ProductID;

This way, the MAX() function allows you to include a non-grouped column like ProductName with your result set when you're grouping by ProductID.

Dealing with complex operators: CTEs and Subqueries

Complex grouping needs often mean going for Common Table Expressions (CTEs) and subqueries:

WITH GroupedData AS ( SELECT ProductID, SUM(OrderQuantity) AS TotalOrdered FROM Orders GROUP BY ProductID ) SELECT p.ProductID, p.ProductName, g.TotalOrdered FROM Products p JOIN GroupedData g ON p.ProductID = g.ProductID;

Breaking down complicated queries into manageable portions keeps your SQL query neat and your sanity intact.

No two alike with GROUP_CONCAT

If you need a list of unique values within a group, then GROUP_CONCAT, when coupled with DISTINCT, just might save your day:

SELECT ProductID, GROUP_CONCAT(DISTINCT SupplierID /* David Copperfield's disappearing act */ ORDER BY SupplierID) AS Suppliers FROM ProductSuppliers GROUP BY ProductID;

GROUP_CONCAT conjoins the SupplierIDs for every ProductID, and DISTINCT ensures that only unique values make the list.

Temporary sanctuaries

Temporary tables can be the oasis you seek in the dry desert of complex data manipulations:

CREATE TEMPORARY TABLE TempSales AS SELECT salesperson, SUM(sales_amount) AS total_sales /* "Sum" of all trades */ FROM sales GROUP BY salesperson; SELECT t.salesperson, t.total_sales FROM TempSales t;

Creating and utilizing a temporary table makes the handling of intricate queries much more manageable and well defined.

The VARCHAR savior

In join and group scenarios, VARCHAR types for textual data can be the guardian of consistency and precision:

SELECT s1.salesperson, s2.total_sales FROM Sales s1 JOIN ( SELECT salesperson, SUM(sales_amount) AS total_sales FROM Sales GROUP BY salesperson ) s2 ON s1.salesperson = s2.salesperson;

For eyes and minds: Formatted presentation

A sense of order can always help the mind work better, which is why ORDER BY is essential for a structured data presentation:

SELECT CategoryID, SUM(ProductSales) FROM ProductSalesData GROUP BY CategoryID ORDER BY SUM(ProductSales) DESC; /* All hail the sales king! */

Ordering the categories by the amount of their sales offers clear insights into the best performers.