Table of Contents
Introduction to Aggregate Functions
Aggregate functions perform calculations on multiple rows and return a single result. They're essential for data analysis and reporting.
Common Aggregate Functions
COUNT() - Counting Records
-- Count all customers
SELECT COUNT(*) AS total_customers FROM customers;
-- Count customers with phone numbers
SELECT COUNT(phone) AS customers_with_phone FROM customers;
-- Count unique cities
SELECT COUNT(DISTINCT city) AS unique_cities FROM customers;
SUM() - Adding Values
-- Total sales amount
SELECT SUM(total_amount) AS total_sales FROM orders;
-- Sum of product prices in category 1
SELECT SUM(price) AS category_total FROM products WHERE category_id = 1;
AVG() - Average Values
-- Average order amount
SELECT AVG(total_amount) AS average_order FROM orders;
-- Average product price
SELECT AVG(price) AS average_price FROM products;
MIN() and MAX() - Minimum and Maximum
-- Cheapest and most expensive products
SELECT
MIN(price) AS cheapest_price,
MAX(price) AS most_expensive_price
FROM products;
-- Earliest and latest order dates
SELECT
MIN(order_date) AS first_order,
MAX(order_date) AS latest_order
FROM orders;
GROUP BY Clause
GROUP BY groups rows with the same values and allows aggregate functions to be applied to each group.
Basic GROUP BY
-- Count customers by city
SELECT city, COUNT(*) AS customer_count
FROM customers
GROUP BY city;
Multiple Columns in GROUP BY
-- Count customers by city and state
SELECT city, state, COUNT(*) AS customer_count
FROM customers
GROUP BY city, state
ORDER BY state, city;
HAVING Clause
HAVING filters groups after GROUP BY is applied (WHERE filters before grouping).
-- Cities with more than 5 customers
SELECT city, COUNT(*) AS customer_count
FROM customers
GROUP BY city
HAVING COUNT(*) > 5;
Practical Examples
Sales Analysis
-- Monthly sales summary
SELECT
YEAR(order_date) AS year,
MONTH(order_date) AS month,
COUNT(*) AS total_orders,
SUM(total_amount) AS total_sales,
AVG(total_amount) AS average_order_value
FROM orders
GROUP BY YEAR(order_date), MONTH(order_date)
ORDER BY year, month;
Customer Analysis
-- Customer order statistics
SELECT
c.customer_id,
c.first_name,
c.last_name,
COUNT(o.order_id) AS total_orders,
SUM(o.total_amount) AS total_spent,
AVG(o.total_amount) AS average_order_value,
MAX(o.order_date) AS last_order_date
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_spent DESC;
Product Performance
-- Best selling products
SELECT
p.product_name,
SUM(oi.quantity) AS total_sold,
SUM(oi.quantity * oi.unit_price) AS total_revenue
FROM products p
JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_revenue DESC
LIMIT 10;
Advanced Grouping Techniques
Conditional Aggregation
-- Count orders by status
SELECT
COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending_orders,
COUNT(CASE WHEN status = 'processing' THEN 1 END) AS processing_orders,
COUNT(CASE WHEN status = 'shipped' THEN 1 END) AS shipped_orders,
COUNT(CASE WHEN status = 'delivered' THEN 1 END) AS delivered_orders
FROM orders;
Percentage Calculations
-- Customer distribution by city (with percentages)
SELECT
city,
COUNT(*) AS customer_count,
ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM customers), 2) AS percentage
FROM customers
GROUP BY city
ORDER BY customer_count DESC;
Date-based Grouping
-- Daily sales for current month
SELECT
DATE(order_date) AS order_day,
COUNT(*) AS orders_count,
SUM(total_amount) AS daily_sales
FROM orders
WHERE MONTH(order_date) = MONTH(CURRENT_DATE())
AND YEAR(order_date) = YEAR(CURRENT_DATE())
GROUP BY DATE(order_date)
ORDER BY order_day;
Combining WHERE and HAVING
-- High-value customers from specific cities
SELECT
city,
COUNT(*) AS customer_count,
AVG(total_spent) AS avg_spent_per_customer
FROM (
SELECT
c.city,
c.customer_id,
SUM(o.total_amount) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE c.city IN ('New York', 'Los Angeles', 'Chicago')
GROUP BY c.city, c.customer_id
) customer_spending
GROUP BY city
HAVING AVG(total_spent) > 500;
NULL Handling in Aggregates
-- Aggregate functions ignore NULL values
SELECT
COUNT(*) AS total_customers, -- Counts all rows
COUNT(phone) AS customers_with_phone, -- Counts non-NULL phone values
COUNT(DISTINCT city) AS unique_cities -- Counts unique non-NULL cities
FROM customers;
Window Functions Preview
For more advanced analytics, you can use window functions (available in newer MySQL/MariaDB versions):
-- Running total of sales
SELECT
order_date,
total_amount,
SUM(total_amount) OVER (ORDER BY order_date) AS running_total
FROM orders
ORDER BY order_date;
Performance Tips
- Index GROUP BY columns: Create indexes on frequently grouped columns
- Limit result sets: Use LIMIT when you don't need all groups
- Use HAVING efficiently: Filter as much as possible with WHERE before grouping
- Consider subqueries: Sometimes breaking complex queries into steps improves performance
Common Mistakes to Avoid
- Selecting non-grouped columns: All SELECT columns must be in GROUP BY or be aggregate functions
- Using WHERE with aggregates: Use HAVING instead of WHERE for aggregate conditions
- Forgetting ORDER BY: GROUP BY doesn't guarantee order; use ORDER BY for sorted results
Practice Exercises
- Find the total number of orders and total sales amount for each customer
- Calculate the average price of products in each category
- Find cities with more than 10 customers and their average order value
- Create a monthly sales report showing growth compared to the previous month
Next Steps
In the next lesson, we'll explore JOIN operations to combine data from multiple tables and create more comprehensive queries.