Aggregate functions in MySQL perform a calculation on a set of values and return a single result. They are commonly used with the GROUP BY clause, which groups rows that share common values so that the calculation is performed once per group.
Overview
Aggregate functions operate on multiple rows of data and return a single summarized value. They are particularly useful for:
- Calculating totals, averages, and counts
- Finding minimum and maximum values
- Performing statistical analysis
- Creating summary reports
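The queries in this article read from an employees table whose definition is not given. The sketch below is one possible definition, offered only so the examples can be tried out: the column names come from the queries that follow, while the column types, the id column, and the sample rows are assumptions.

```sql
-- Hypothetical schema: column names are taken from the queries in this article;
-- the types, the id column, and the sample rows are assumptions for illustration.
CREATE TABLE employees (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    name       VARCHAR(100) NOT NULL,
    email      VARCHAR(255),      -- nullable, so COUNT(email) can differ from COUNT(*)
    department VARCHAR(50),
    job_title  VARCHAR(100),
    salary     DECIMAL(10, 2),
    bonus      DECIMAL(10, 2),
    hire_date  DATE
);

INSERT INTO employees (name, email, department, job_title, salary, bonus, hire_date) VALUES
    ('Ana',   'ana@example.com',   'Engineering', 'Developer', 70000.00, 5000.00, '2019-03-01'),
    ('Ben',   NULL,                'Engineering', 'Developer', 65000.00, 5000.00, '2020-07-15'),
    ('Chloe', 'chloe@example.com', 'Sales',       'Manager',   55000.00, NULL,    '2018-11-20'),
    ('Dev',   'dev@example.com',   'Sales',       'Associate', 45000.00, 2000.00, '2021-01-10');

-- Four input rows collapse into a single summary row.
SELECT COUNT(*)       AS headcount,     -- 4
       SUM(salary)    AS total_salary,  -- 235000.00
       MAX(hire_date) AS latest_hire    -- 2021-01-10
FROM employees;
```

With only four rows the results are easy to check by hand, which helps when experimenting with the functions described below.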
Core Aggregate Functions
1. COUNT()
The COUNT() function returns the number of rows in a result set or group. COUNT(*) counts every row, while COUNT(expression) counts only the rows where the expression is not NULL.
Syntax:
COUNT(expression)
COUNT(*)
COUNT(DISTINCT expression)
Examples:
-- Count all rows
SELECT COUNT(*) FROM employees;
-- Count non-NULL values in a specific column
SELECT COUNT(email) FROM employees;
-- Count distinct values
SELECT COUNT(DISTINCT department) FROM employees;
-- Count with conditions
SELECT COUNT(*) FROM employees WHERE salary > 50000;
2. SUM()
The SUM() function calculates the sum of numeric values.
Syntax:
SUM(expression)
SUM(DISTINCT expression)
Examples:
-- Sum all salaries
SELECT SUM(salary) FROM employees;
-- Sum with GROUP BY
SELECT department, SUM(salary) as total_salary
FROM employees
GROUP BY department;
-- Sum distinct values only
SELECT SUM(DISTINCT bonus) FROM employees;
3. AVG()
The AVG() function calculates the average value of numeric data.
Syntax:
AVG(expression)
AVG(DISTINCT expression)
Examples:
-- Average salary
SELECT AVG(salary) FROM employees;
-- Average by department
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department;
-- Average of distinct values
SELECT AVG(DISTINCT salary) FROM employees;
4. MIN()
The MIN() function returns the smallest value in a set.
Syntax:
MIN(expression)
Examples:
-- Minimum salary
SELECT MIN(salary) FROM employees;
-- Minimum salary by department
SELECT department, MIN(salary) as min_salary
FROM employees
GROUP BY department;
-- Minimum date
SELECT MIN(hire_date) FROM employees;
5. MAX()
The MAX() function returns the largest value in a set.
Syntax:
MAX(expression)
Examples:
-- Maximum salary
SELECT MAX(salary) FROM employees;
-- Maximum salary by department
SELECT department, MAX(salary) as max_salary
FROM employees
GROUP BY department;
-- Latest hire date
SELECT MAX(hire_date) FROM employees;
Advanced Aggregate Functions
6. GROUP_CONCAT()
The GROUP_CONCAT() function concatenates values from multiple rows into a single string. The result is truncated to the length set by the group_concat_max_len system variable (1024 by default).
Syntax:
GROUP_CONCAT([DISTINCT] expression [ORDER BY expression] [SEPARATOR separator])
Examples:
-- Concatenate employee names by department
SELECT department, GROUP_CONCAT(name) as employees
FROM employees
GROUP BY department;
-- With custom separator and ordering
SELECT department,
GROUP_CONCAT(name ORDER BY name SEPARATOR ', ') as employees
FROM employees
GROUP BY department;
7. STDDEV() / STD()
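STDDEV() and STD() calculate the population standard deviation of a set of values. Both are synonyms for the standard SQL function STDDEV_POP().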
Examples:
-- Standard deviation of salaries
SELECT STDDEV(salary) FROM employees;
-- Standard deviation by department
SELECT department, STDDEV(salary) as salary_stddev
FROM employees
GROUP BY department;
8. VARIANCE() / VAR_POP()
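VARIANCE() and VAR_POP() calculate the population variance of a set of values. VARIANCE() is a synonym for the standard SQL function VAR_POP().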
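In MySQL, VARIANCE(), STD() and STDDEV() all compute population statistics (dividing by N). The sample versions (dividing by N - 1) are VAR_SAMP() and STDDEV_SAMP(). The following sketch compares them on a small scratch table; the table name variance_demo and its values are made up for illustration.

```sql
-- Hypothetical scratch table: eight values whose mean is 5.
CREATE TEMPORARY TABLE variance_demo (x INT);
INSERT INTO variance_demo (x) VALUES (2), (4), (4), (4), (5), (5), (7), (9);

SELECT
    VAR_POP(x)     AS var_pop,      -- 4: squared deviations sum to 32, divided by N = 8
    VARIANCE(x)    AS variance_syn, -- same value as VAR_POP(x)
    STDDEV_POP(x)  AS stddev_pop,   -- 2: square root of the population variance
    STDDEV(x)      AS stddev_syn,   -- same value as STDDEV_POP(x)
    VAR_SAMP(x)    AS var_samp,     -- 32 / 7: divides by N - 1 instead of N
    STDDEV_SAMP(x) AS stddev_samp   -- square root of VAR_SAMP(x)
FROM variance_demo;
```

Which version is appropriate depends on whether the rows are the whole population or a sample drawn from it. For a salary table that holds every employee, the population functions are usually the natural choice.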
Examples:
-- Variance of salaries
SELECT VARIANCE(salary) FROM employees;
-- Population variance
SELECT VAR_POP(salary) FROM employees;
Using Aggregate Functions with GROUP BY
The GROUP BY clause is often used with aggregate functions to group rows that have the same values in the specified columns.
Basic GROUP BY Example:
SELECT department,
COUNT(*) as employee_count,
AVG(salary) as avg_salary,
MIN(salary) as min_salary,
MAX(salary) as max_salary
FROM employees
GROUP BY department;
Multiple Column Grouping:
SELECT department, job_title,
COUNT(*) as count,
AVG(salary) as avg_salary
FROM employees
GROUP BY department, job_title;
HAVING Clause
The HAVING clause filters the groups created by GROUP BY based on aggregate function results.
Examples:
-- Departments with more than 5 employees
SELECT department, COUNT(*) as employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 5;
-- Departments with average salary above 60000
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 60000;
Practical Examples
Example 1: Sales Report
SELECT
YEAR(order_date) as year,
MONTH(order_date) as month,
COUNT(*) as total_orders,
SUM(amount) as total_revenue,
AVG(amount) as avg_order_value,
MIN(amount) as min_order,
MAX(amount) as max_order
FROM orders
GROUP BY YEAR(order_date), MONTH(order_date)
ORDER BY year, month;
Example 2: Employee Statistics
SELECT
department,
COUNT(*) as headcount,
AVG(salary) as avg_salary,
MIN(salary) as min_salary,
MAX(salary) as max_salary,
SUM(salary) as total_payroll,
GROUP_CONCAT(DISTINCT job_title) as positions
FROM employees
GROUP BY department
HAVING COUNT(*) >= 3
ORDER BY avg_salary DESC;
Example 3: Customer Analysis
SELECT
customer_id,
COUNT(*) as order_count,
SUM(amount) as total_spent,
AVG(amount) as avg_order_value,
MIN(order_date) as first_order,
MAX(order_date) as last_order,
DATEDIFF(MAX(order_date), MIN(order_date)) as customer_lifetime_days
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 1
ORDER BY total_spent DESC;
Important Notes and Best Practices
NULL Handling
- Most aggregate functions ignore NULL values
- COUNT(*) counts all rows, including those with NULL values
- COUNT(column_name) counts only non-NULL values
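The NULL rules above are easiest to see on a tiny table. The sketch below uses a hypothetical scratch table (null_demo, with made-up values) to show how one NULL changes each result, and what the functions return when no rows match at all.

```sql
-- Hypothetical scratch table; the names and values are illustrative only.
CREATE TEMPORARY TABLE null_demo (
    id    INT PRIMARY KEY,
    bonus DECIMAL(10, 2)   -- nullable on purpose
);

INSERT INTO null_demo (id, bonus) VALUES
    (1, 100.00),
    (2, NULL),
    (3, 200.00);

SELECT
    COUNT(*)                AS all_rows,          -- 3: every row is counted
    COUNT(bonus)            AS non_null_bonus,    -- 2: the NULL is skipped
    SUM(bonus)              AS total_bonus,       -- 300: NULL contributes nothing
    AVG(bonus)              AS avg_ignoring_null, -- 150: 300 divided by 2 non-NULL rows
    AVG(COALESCE(bonus, 0)) AS avg_null_as_zero   -- 100: 300 divided by all 3 rows
FROM null_demo;

-- With no matching rows, COUNT() returns 0 but SUM() returns NULL.
SELECT COUNT(bonus) AS cnt, SUM(bonus) AS total
FROM null_demo
WHERE id > 100;
```

Whether a NULL should be skipped or treated as zero depends on what the column means. Wrapping the column in COALESCE() makes that choice explicit in the query.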
Performance Considerations
- Use indexes on columns used in GROUP BY clauses
- Consider using covering indexes for better performance
- Filter rows with WHERE before grouping where possible; HAVING is applied only after the groups have been aggregated
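As a rough sketch of the indexing advice above, the statements below add a composite index and inspect the plan of a grouped query. The index name is hypothetical, the employees table is assumed to look like the one used throughout this article, and the plan MySQL actually picks depends on the server version, table size, and statistics.

```sql
-- Hypothetical index name. The index holds both columns the query below reads,
-- so it can serve as a covering index for that query.
CREATE INDEX idx_employees_dept_salary ON employees (department, salary);

-- Inspect the plan. If the optimizer chooses the index, the Extra column of the
-- EXPLAIN output typically contains "Using index".
EXPLAIN
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
```

Putting the GROUP BY column first lets the server read rows already ordered by department, and including salary means the table rows themselves may not need to be read. An index adds write overhead, so it is worth checking with EXPLAIN that it is actually used.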
Common Pitfalls
- Mixing aggregate and non-aggregate columns without GROUP BY causes an error when the ONLY_FULL_GROUP_BY SQL mode is enabled
- Using WHERE instead of HAVING for filtering aggregate results
- Forgetting about NULL values in calculations
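The first two pitfalls can be illustrated as follows, assuming the employees table used throughout this article. The failing statements are left as comments so that the block runs as written. The alias names are made up for the example.

```sql
-- Pitfall: an aggregate function in WHERE is rejected.
-- SELECT department, COUNT(*) FROM employees WHERE COUNT(*) > 5 GROUP BY department;
-- ERROR 1111 (HY000): Invalid use of group function

-- WHERE filters rows before grouping; HAVING filters groups after aggregation.
SELECT department, COUNT(*) AS well_paid_count
FROM employees
WHERE salary > 50000      -- row-level filter, applied first
GROUP BY department
HAVING COUNT(*) > 5;      -- group-level filter, applied to the aggregated result

-- Pitfall: name is neither aggregated nor listed in GROUP BY, so this is
-- rejected when ONLY_FULL_GROUP_BY is enabled.
-- SELECT department, name, MAX(salary) FROM employees GROUP BY department;

-- One option is ANY_VALUE() (available since MySQL 5.7), which states
-- explicitly that any row's value from the group is acceptable.
SELECT department,
       ANY_VALUE(name) AS sample_name,
       MAX(salary)     AS max_salary
FROM employees
GROUP BY department;
```

Note that sample_name is not guaranteed to belong to the employee who earns max_salary. Retrieving that row would need a join, a subquery, or a window function.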
MySQL Modes
In ONLY_FULL_GROUP_BY mode (enabled by default since MySQL 5.7.5), every non-aggregated column in the SELECT list must appear in the GROUP BY clause or be functionally dependent on the GROUP BY columns.
Conclusion
Aggregate functions are powerful tools for data analysis in MySQL. They enable you to:
- Summarize large datasets efficiently
- Generate meaningful reports and statistics
- Perform complex analytical queries
- Create dashboard-ready data summaries
Master these functions to unlock the full potential of your MySQL data analysis capabilities.