Table of Contents Link to heading
- Aggregate Queries
COUNT
functionSUM
functionAVG
functionMAX
andMIN
functions- Maths in
SELECT
clause - Multiple aggregate queries
GROUP BY
clause- Aggregate queries with
JOINS
HAVING
clause- HAVING and WHERE clauses
- Complete syntax of an SQL query
Aggregate Queries Link to heading
Aggregate queries or aggregation is a method of performing computations over sets of numerical values in multiple tuples of a relation and returning a single value.
SQL provides users with many aggregate functions for aggregate queries, including AVG, COUNT, SUM, MIN, MAX, etc.
Aggregate functions can be used within three clauses, namely SELECT, HAVING, and ORDER BY.
- Group BY allows us to partition our relations into groups and then we can compute our aggregate values over each group independently.
- The HAVING condition allows us to filter our results based on aggregate values.
With the exception of COUNT, all aggregate functions:
- Apply to a single attribute.
- Ignore NULL values when performing the calculation.
COUNT
function
Link to heading
This function returns the number of tuples (an integer) that matches a specified criterion.
COUNT() syntax Link to heading
COUNT ( [DISTINCT] expression | * )
Argument | Semantic |
---|---|
DISTINCT | COUNT will operate only on one unique instance of every value |
expression | An expression of any type; no aggregate functions or subqueries |
* | Specify that all tuples should be counted; not support DISTINCT |
Examples Link to heading
Given the following example database.
๐ Find the number of employees.
SELECT COUNT (E.*) AS employeeCount
FROM Employee AS E;
๐ Find the number of tuples having a nonnull value for salary.
SELECT COUNT (E.salary) AS employeeCount
FROM Employee AS E;
๐ Find the number of different values of salary.
SELECT COUNT (DISTINCT E.salary) AS employeeCount
FROM Employee AS E;
Note: DISTINCT will only return the first of duplicate values.
- If two people earn 40k, it will only be counted once, not twice.
SUM
function
Link to heading
This function returns the sum of all the numerical values of an attribute (specified by the expression).
SUM() syntax Link to heading
SUM ( [DISTINCT] expression )
Argument | Semantic |
---|---|
DISTINCT | SUM will operate only on one unique instance of every value |
expression | A constant, column, function, or any combination of arithmetic, bitwise, and string operators; no aggregate functions or subqueries |
Return Types Link to heading
Return the summation of all expression values in the most precise expression data type.
Expression result | Return type |
---|---|
tinyint | int |
smallint | int |
int | int |
bigint | bigint |
decimal category (p, s) | decimal(38, s) |
money and smallmoney category | money |
float and real category | float |
๐ Microsoft
Example Link to heading
Given the following example database.
๐ Find the sum of the salaries of the Administration department.
SELECT SUM (E.salary) AS salarySum
FROM Employee AS E
WHERE E.dept = 'Administration';
AVG
function
Link to heading
This function returns the average of all the numerical values of an attribute (specified by the expression).
AVG() syntax Link to heading
AVG ( [DISTINCT] expression )
Argument | Semantic |
---|---|
DISTINCT | AVG will operate only on one unique instance of every value |
expression | An expression of the exact numeric or approximate numeric data type category (-bit); no aggregate functions or subqueries |
Return Types Link to heading
The evaluated result of expression determines the return type.
Expression result | Return type |
---|---|
tinyint | int |
smallint | int |
int | int |
bigint | bigint |
decimal category (p, s) | decimal(38, max(s,6)) |
money and smallmoney category | money |
float and real category | float |
๐ Microsoft
Example Link to heading
Given the following example database.
๐ Find the average of the salaries of the Production department.
SELECT AVG (E.salary) AS salaryAverage
FROM Employee AS E
WHERE E.dept = 'Production';
MAX
and MIN
functions
Link to heading
These functions return maximum or minimum value in the expression.
- The return type is the same as the expression).
- MAX returns NULL when there is no row to select.
MAX() | MIN() syntax Link to heading
MAX | MIN ( [DISTINCT] expression )
Argument | Semantic |
---|---|
DISTINCT | MAX and MIN will ignore duplicate values |
expression | An expression of the exact numeric or approximate numeric data type category (-bit); no aggregate functions or subqueries |
Examples Link to heading
Given the following example database.
๐ For expressions that are ‘strings’ (city, firstName), the alphabetical order goes from the smallest letter (a) to the largest letter (w).
๐ For expressions that are ‘date’ (dateOfBirth, orderDate), such as a list of dates between 2002-07-03 and 2022-03-22).
SELECT MIN (C.date) FROM Customer AS C;
-- will return 2002-07-03
SELECT MAX (C.date) FROM Customer AS C;
-- will return 2022-03-22
Maths in SELECT
clause
Link to heading
Multiple aggregate queries Link to heading
Multiple aggregate queries can be combined into a singe query within the SELECT clause.
However, it is only acceptable when each aggregate queries in the combination only returns a single value.
SELECT (
MAX (E.Salary) AS [Max Salary],
MIN (E.Salary) AS [Min Salary]
FROM Employee AS E
Result:
Max Salary | Min Salary |
---|---|
80 | 40 |
GROUP BY
clause
Link to heading
Given the following example database.
๐ Find the highest salary paid in each department.
When conducting aggregate queries, there is an error that is often encountered.
SELECT E.dept, MAX (E.salary) AS [Highest Salary in this Department]
FROM Employee AS E;
Our query completed with an error, displaying:
Msg 8120, Level 16, State 1, Line 1
Column 'Employee.firstName' is invalid in the select list because it is
not contained in either an aggregate function or the GROUP BY clause.
Rationale: in the SELECT clause, especially
- All non-aggregate attributes in the SELECT clause must appear in the GROUP BY clause.
- The GROUP BY clause collects data across multiple records and then groups the results by one or more attributes.
- The GROUP BY clause may have more attributes than those non-aggregate attributes in the SELECT clause.
Solution: use a GROUP BY clause followed by the non-aggregate attribute(s), as instructed in the error message.
SELECT E.dept, MAX (E.salary) AS [Highest Salary in this Department]
FROM Employee AS E
GROUP BY E.delp;
Result:
dept | Highest Salary in this Department |
---|---|
Administration | 45 |
Distribution | 80 |
Planning | 80 |
Production | 50 |
Use with other Keywords Link to heading
The GROUP BY clause can be implemented with many other clauses, such as AS, TOP, ORDER BY, DISTINCT, etc.
For example:
- This will display the same result as above but with Department names being placed in the reversed order.
SELECT E.dept, MAX (E.salary) AS [Highest Salary in this Department]
FROM Employee AS E
GROUP BY E.dept
ORDER BY E.dept DESC;
- This will display only the first result.
SELECT TOP 1 E.dept, MAX (E.salary) AS [Highest Salary in this Department]
FROM Employee AS E
GROUP BY E.dept
Multiple non-aggregate attributes Link to heading
When there are multiple regular, non-aggregate attributes in the
Two non-aggregate attributes Link to heading
๐ Find the highest salary paid in each office in each department.
There are two ways of listing them. The result will be sorted based on the second column in the GROUP BY clause.
- Listing in the original order.
- The result will
ORDER BY
E.office
SELECT E.dept, E.office, MAX (E.salary)
FROM Employee AS E
GROUP BY E.firstName, E.office;
- Listing in the reversed order.
- The result will
ORDER BY
E.firstName
SELECT E.dept, E.office, MAX (E.salary)
FROM Employee AS E
GROUP BY E.office, E.firstName;
More than two non-aggregate columns Link to heading
๐ Find the highest salary paid in each office in each department in each city.
There are many ways of listing them. The result will be sorted based on the first column in the GROUP BY clause.
In the below example, the result will ORDER BY
E.city
SELECT E.dept, E.office, E.city, MAX (E.salary)
FROM Employee AS E
GROUP BY E.city, E.office, E.dept;
Aggregate queries with JOINS
Link to heading
Given the following example database.
๐ Find the maximum salary among the employees who work in a department based in London.
- Join Employee with Department to take into account what city they work in (not live in!) โ filter on London.
- This returns many tuples.
- Find the maximum salary from the filtered result โ MAX(Salary).
- This returns the single value.
SELECT MAX (E. salary) AS MaxLondonSal
FROM Employee AS E INNER JOIN Department AS D
ON E.dept = D.deptName
WHERE d.city = 'London';
HAVING
clause
Link to heading
The HAVING clause is used to place conditions on the result of an aggregate operator.
This clause is only used in the presence of aggregation.
For example, given the following example database.
๐ Find which departments spend more than 150 on salaries.
Correct query:
SELECT E.dept, SUM (E.salary) AS theSalary
FROM Employee AS E
GROUP BY E.dept
HAVING SUM (E.salary) > 150;
Incorrect query:
SELECT E.dept, SUM (E.salary) AS theSalary
FROM Employee AS E
GROUP BY E.dept
HAVING theSalary > 150;
Result:
dept | theSalary |
---|---|
Planning | 153 |
HAVING and WHERE clauses Link to heading
The HAVING
clause allows us to apply conditions to the whole result of
aggregate functions and to check conditions that apply to the
whole group.
- Conditions involving aggregate operators must appear in a HAVING clause.
Whereas the WHERE
clause applies conditions to 1 tuple at a time.
- Conditions involving non-aggregate operators must appear in the WHERE clause.
For example, given the following example database.
๐ Find the departments in which the average salary of employees working in office 20 is higher than 25.
This will require a combination of a normal condition (WHERE clause) and a aggregate condition (HAVING clause).
SELECT E.dept, AVG (E.salary)
FROM Employee AS E
WHERE E.office = '20'
GROUP BY E.dept
HAVING AVG (E.salary) > 25;
Complete syntax of an SQL query Link to heading
SELECT <selectList>
FROM <tableList>
WHERE <condition>
GROUP BY <groupingAttributeList>
HAVING <aggregateCondition>
ORDER BY <orderingAttributeList>