MySQL Aggregate Functions / Tractorscope Docs

Aggregate functions are essential tools in MySQL that perform calculations on a set of values and return a single result. They are commonly used with the GROUP BY clause to group rows that share common values and perform calculations on each group.

Overview

Aggregate functions operate on multiple rows of data and return a single summarized value. They are particularly useful for:

Calculating totals, averages, and counts
Finding minimum and maximum values
Performing statistical analysis
Creating summary reports

Core Aggregate Functions

1. COUNT()

The COUNT() function returns the number of rows that match specified criteria.

Syntax:

COUNT(expression)
COUNT(*)
COUNT(DISTINCT expression)

Examples:

-- Count all rows
SELECT COUNT(*) FROM employees;

-- Count non-NULL values in a specific column
SELECT COUNT(email) FROM employees;

-- Count distinct values
SELECT COUNT(DISTINCT department) FROM employees;

-- Count with conditions
SELECT COUNT(*) FROM employees WHERE salary > 50000;
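
When several conditional counts are needed in a single pass, the condition can move out of WHERE and into the aggregate itself. The following is a minimal sketch against the same employees table; the column aliases are illustrative and not part of any schema.

```sql
-- COUNT() skips NULLs, and a CASE without ELSE yields NULL,
-- so only rows that satisfy each condition are counted.
SELECT COUNT(*) AS total_employees,
       COUNT(CASE WHEN salary > 50000 THEN 1 END) AS above_50k,
       COUNT(CASE WHEN email IS NULL THEN 1 END) AS missing_email
FROM employees;
```

This returns one row with all three counts, which avoids running a separate query per condition.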

2. SUM()

The SUM() function calculates the total sum of numeric values.

Syntax:

SUM(expression)
SUM(DISTINCT expression)

Examples:

-- Sum all salaries
SELECT SUM(salary) FROM employees;

-- Sum with GROUP BY
SELECT department, SUM(salary) as total_salary 
FROM employees 
GROUP BY department;

-- Sum distinct values only
SELECT SUM(DISTINCT bonus) FROM employees;
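
One detail worth illustrating: SUM() returns NULL, not 0, when no rows match or when every value is NULL. A hedged sketch of the usual workaround with COALESCE() follows; the department value 'Archive' is a made-up example of a filter that may match nothing.

```sql
-- Without COALESCE, an empty match would yield NULL instead of 0
SELECT COALESCE(SUM(salary), 0) AS total_salary
FROM employees
WHERE department = 'Archive';
```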

3. AVG()

The AVG() function calculates the average value of numeric data.

Syntax:

AVG(expression)
AVG(DISTINCT expression)

Examples:

-- Average salary
SELECT AVG(salary) FROM employees;

-- Average by department
SELECT department, AVG(salary) as avg_salary 
FROM employees 
GROUP BY department;

-- Average of distinct values
SELECT AVG(DISTINCT salary) FROM employees;
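
Because AVG() skips NULL values, the denominator is the number of non-NULL values rather than the number of rows. The sketch below assumes the bonus column is nullable (the source does not say so) and shows both interpretations side by side.

```sql
SELECT AVG(bonus)              AS avg_among_bonus_recipients, -- NULL bonuses are skipped
       AVG(COALESCE(bonus, 0)) AS avg_across_all_employees    -- NULL bonuses count as 0
FROM employees;
```

Which of the two is correct depends on whether a missing bonus means "unknown" or "zero" in your data.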

4. MIN()

The MIN() function returns the smallest value in a set.

Syntax:

MIN(expression)

Examples:

-- Minimum salary
SELECT MIN(salary) FROM employees;

-- Minimum salary by department
SELECT department, MIN(salary) as min_salary 
FROM employees 
GROUP BY department;

-- Minimum date
SELECT MIN(hire_date) FROM employees;

5. MAX()

The MAX() function returns the largest value in a set.

Syntax:

MAX(expression)

Examples:

-- Maximum salary
SELECT MAX(salary) FROM employees;

-- Maximum salary by department
SELECT department, MAX(salary) as max_salary 
FROM employees 
GROUP BY department;

-- Latest hire date
SELECT MAX(hire_date) FROM employees;
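
MIN() and MAX() return only the extreme value, not the rest of the row it came from. One common way to retrieve the full row is to join back to a grouped subquery, sketched here under the assumption that employees has the name, department, and salary columns used elsewhere on this page. The alias dept_max is hypothetical.

```sql
-- Employees who earn the highest salary in their department
SELECT e.department, e.name, e.salary
FROM employees AS e
JOIN (
    SELECT department, MAX(salary) AS max_salary
    FROM employees
    GROUP BY department
) AS dept_max
  ON dept_max.department = e.department
 AND dept_max.max_salary = e.salary;
```

If several employees tie for the top salary, each of them is returned.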

Advanced Aggregate Functions

6. GROUP_CONCAT()

The GROUP_CONCAT() function concatenates values from multiple rows into a single string.

Syntax:

GROUP_CONCAT([DISTINCT] expression [ORDER BY expression] [SEPARATOR separator])

Examples:

-- Concatenate employee names by department
SELECT department, GROUP_CONCAT(name) as employees 
FROM employees 
GROUP BY department;

-- With custom separator and ordering
SELECT department, 
       GROUP_CONCAT(name ORDER BY name SEPARATOR ', ') as employees 
FROM employees 
GROUP BY department;
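
The result of GROUP_CONCAT() is truncated to the length given by the group_concat_max_len system variable, which defaults to 1024 bytes. For large groups the limit may need raising; the sketch below does so for the current session only, and the value 100000 is purely illustrative.

```sql
-- Check the current limit
SHOW VARIABLES LIKE 'group_concat_max_len';

-- Raise it for this session only
SET SESSION group_concat_max_len = 100000;
```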

7. STDDEV() / STD()

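Calculates the population standard deviation of values. STD() and STDDEV() are synonyms for the standard SQL function STDDEV_POP(); use STDDEV_SAMP() for the sample standard deviation.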

Examples:

-- Standard deviation of salaries
SELECT STDDEV(salary) FROM employees;

-- Standard deviation by department
SELECT department, STDDEV(salary) as salary_stddev 
FROM employees 
GROUP BY department;

8. VARIANCE() / VAR_POP()

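Calculates the population variance of values. VARIANCE() is a synonym for the standard SQL function VAR_POP(); use VAR_SAMP() for the sample variance.

Examples:

-- Population variance of salaries
SELECT VARIANCE(salary) FROM employees;

-- Same result using the standard SQL name
SELECT VAR_POP(salary) FROM employees;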
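
To make the population versus sample distinction concrete, the following sketch computes all four statistics in one query. It assumes the same employees table; the aliases are illustrative.

```sql
SELECT STDDEV_POP(salary)  AS stddev_population,   -- same as STD() / STDDEV()
       STDDEV_SAMP(salary) AS stddev_sample,       -- divides by n - 1
       VAR_POP(salary)     AS variance_population, -- same as VARIANCE()
       VAR_SAMP(salary)    AS variance_sample      -- divides by n - 1
FROM employees;
```

The sample versions are usually the appropriate choice when the rows are a sample drawn from a larger population.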

Using Aggregate Functions with GROUP BY

The GROUP BY clause is often used with aggregate functions to group rows with the same values in specified columns.

Basic GROUP BY Example:

SELECT department, 
       COUNT(*) as employee_count,
       AVG(salary) as avg_salary,
       MIN(salary) as min_salary,
       MAX(salary) as max_salary
FROM employees 
GROUP BY department;

Multiple Column Grouping:

SELECT department, job_title,
       COUNT(*) as count,
       AVG(salary) as avg_salary
FROM employees 
GROUP BY department, job_title;

HAVING Clause

The HAVING clause is used to filter groups created by GROUP BY based on aggregate function results.

Examples:

-- Departments with more than 5 employees
SELECT department, COUNT(*) as employee_count
FROM employees 
GROUP BY department 
HAVING COUNT(*) > 5;

-- Departments with average salary above 60000
SELECT department, AVG(salary) as avg_salary
FROM employees 
GROUP BY department 
HAVING AVG(salary) > 60000;
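
WHERE and HAVING can appear in the same query, and they act at different stages: WHERE filters individual rows before grouping, HAVING filters the groups afterwards. A minimal sketch, assuming the hire_date column shown earlier; the cutoff date is an arbitrary example value.

```sql
SELECT department, AVG(salary) AS avg_salary
FROM employees
WHERE hire_date >= '2020-01-01'   -- row filter, applied before grouping
GROUP BY department
HAVING COUNT(*) > 5               -- group filter, applied after aggregation
ORDER BY avg_salary DESC;
```

Here the count and the average consider only employees hired on or after the cutoff, because those rows are the only ones that reach the grouping step.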

Practical Examples

Example 1: Sales Report

SELECT 
    YEAR(order_date) as year,
    MONTH(order_date) as month,
    COUNT(*) as total_orders,
    SUM(amount) as total_revenue,
    AVG(amount) as avg_order_value,
    MIN(amount) as min_order,
    MAX(amount) as max_order
FROM orders 
GROUP BY YEAR(order_date), MONTH(order_date)
ORDER BY year, month;

Example 2: Employee Statistics

SELECT 
    department,
    COUNT(*) as headcount,
    AVG(salary) as avg_salary,
    MIN(salary) as min_salary,
    MAX(salary) as max_salary,
    SUM(salary) as total_payroll,
    GROUP_CONCAT(DISTINCT job_title) as positions
FROM employees 
GROUP BY department
HAVING COUNT(*) >= 3
ORDER BY avg_salary DESC;

Example 3: Customer Analysis

SELECT 
    customer_id,
    COUNT(*) as order_count,
    SUM(amount) as total_spent,
    AVG(amount) as avg_order_value,
    MIN(order_date) as first_order,
    MAX(order_date) as last_order,
    DATEDIFF(MAX(order_date), MIN(order_date)) as customer_lifetime_days
FROM orders 
GROUP BY customer_id
HAVING COUNT(*) > 1
ORDER BY total_spent DESC;

Important Notes and Best Practices

NULL Handling

Most aggregate functions ignore NULL values
COUNT(*) counts all rows, including those with NULL values
COUNT(column_name) only counts non-NULL values
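
The NULL rules above can be checked on a throwaway table. The sketch below uses a hypothetical temporary table named null_demo, created only for this illustration.

```sql
CREATE TEMPORARY TABLE null_demo (bonus INT);
INSERT INTO null_demo VALUES (100), (NULL), (300);

SELECT COUNT(*)     AS all_rows,       -- 3: every row is counted
       COUNT(bonus) AS non_null_rows,  -- 2: the NULL is skipped
       SUM(bonus)   AS total,          -- 400: the NULL is skipped
       AVG(bonus)   AS average         -- 400 / 2, not 400 / 3
FROM null_demo;
```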

Performance Considerations

Use indexes on columns used in GROUP BY clauses
Consider using covering indexes for better performance
Filter rows with WHERE before grouping whenever possible; HAVING is applied only after the groups have been built
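
As a rough illustration of the indexing advice, the sketch below creates a composite index that covers a grouped salary query and then inspects the plan. The index name idx_employees_dept_salary is hypothetical, and whether the optimizer actually uses the index depends on the data and the MySQL version.

```sql
-- Grouping column first, aggregated column second
CREATE INDEX idx_employees_dept_salary ON employees (department, salary);

-- "Using index" in the Extra column indicates the index covers the query
EXPLAIN
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
```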

Common Pitfalls

Mixing aggregate and non-aggregate columns without GROUP BY raises an error when ONLY_FULL_GROUP_BY is enabled
Using WHERE instead of HAVING to filter on aggregate results (WHERE is evaluated before aggregation and cannot reference aggregate functions)
Forgetting about NULL values in calculations

MySQL Modes

When the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5), all non-aggregate columns in SELECT must appear in the GROUP BY clause or be functionally dependent on the GROUP BY columns.
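
The sketch below shows a query that this mode rejects, followed by two possible rewrites. It assumes the employees table used throughout this page; the aliases are illustrative.

```sql
-- Rejected under ONLY_FULL_GROUP_BY: name is neither grouped nor aggregated
-- SELECT department, name, COUNT(*) FROM employees GROUP BY department;

-- Option 1: aggregate the extra column
SELECT department, MIN(name) AS first_name_alphabetically, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

-- Option 2: explicitly accept an arbitrary value from each group
SELECT department, ANY_VALUE(name) AS sample_name, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
```

ANY_VALUE() is available from MySQL 5.7 and makes the choice of an arbitrary row explicit rather than relying on the mode being disabled.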

Conclusion

Aggregate functions are powerful tools for data analysis in MySQL. They enable you to:

Summarize large datasets efficiently
Generate meaningful reports and statistics
Perform complex analytical queries
Create dashboard-ready data summaries

Master these functions to unlock the full potential of your MySQL data analysis capabilities.