In SQL, how can you "group by" in ranges?

sql

group-by

bucketing

sql-queries

byAnton Shumikhin·Feb 24, 2025

Perform range grouping using SQL CASE, creating data buckets:

SELECT
  CASE
    WHEN total_sale < 100 THEN '0 - 99'
    WHEN total_sale < 200 THEN '100 - 199'
    WHEN total_sale < 300 THEN '200 - 299'
    ELSE '300+'
  END AS sale_range,
  COUNT(*) AS order_count
FROM orders
GROUP BY sale_range;

This approach categorizes sales into defined ranges and counts the orders falling in these ranges.

Basics of Bucketing

Relying on a CASE statement in SQL permits the creation of defined bucket boundaries and aggregated data. This way, scores can be grouped by 10-interval ranges:

SELECT
  FLOOR(score/10)*10 AS score_range_start,
  COUNT(*) AS total_counts
FROM results
GROUP BY score_range_start
ORDER BY score_range_start;
-- Coffee time! All these scores have found their bucket home!

Molding Ranges to Your Needs

Real-world data is often diverse, requiring dynamic range groupings. With the CASE statement, handle the varying ranges and custom intervals efficiently:

SELECT
  CASE
    WHEN age < 20 THEN 'Under 20'
    WHEN age BETWEEN 20 AND 29 THEN '20s'
    WHEN age BETWEEN 30 AND 39 THEN '30s'
    WHEN age BETWEEN 40 AND 49 THEN '40s'
    WHEN age >= 50 THEN '50+' 
  END AS age_group,
  COUNT(*) AS count
FROM users
GROUP BY age_group;
-- Finally, age is not just a number anymore. It's a group!

Up Your Game with Advanced Groupings

For more complex scenarios, subqueries or Common Table Expressions (CTEs) can preprocess your data for efficient grouping. Imagine the power of defining your groupings before the actual count:

WITH ScoreRanges AS (
  SELECT
    id,
    CASE
      WHEN score < 50 THEN 'Failing'
      WHEN score < 70 THEN 'Passing'
      WHEN score >= 70 THEN 'Above Average'
    END AS ScoreGroup
  FROM StudentScores
)
SELECT ScoreGroup, COUNT(id) AS NumberOfStudents
FROM ScoreRanges
GROUP BY ScoreGroup;
-- CTEs be like: "A little prep won't hurt. Ready, Set, Group!"

Dealing with Dynamic Grouping

While static ranges are handy, sometimes you need dynamic bucketing. For instance, using NTILE distributes data into equal-sized buckets—a real lifesaver for percentile or quantile analyses:

SELECT NTILE(4) OVER (ORDER BY value) AS quartile, COUNT(*) AS count
FROM measurements
GROUP BY quartile;
-- SQL-QA: "How many buckets you need, boss?" You: "Just NTILE it!"

Winning Over Outliers

Outliers, our uninvited guests! When dealing with outliers or skewed data, don't rely on static ranges. Group using a statistical measure, like standard deviations:

SELECT
  CASE
    WHEN value < (AVG(value) - STDDEV(value)) THEN 'Below Average'
    WHEN value BETWEEN (AVG(value) - STDDEV(value)) AND (AVG(value) + STDDEV(value)) THEN 'Average'
    ELSE 'Above Average'
  END AS ValueRange,
  COUNT(*) AS Frequency
FROM data_points
GROUP BY ValueRange;
-- Outliers walking in... SQL: "Hold my query, watch this!"