4. The Relational Model & Relational Algebra
Relational Algebra — The Mathematics Behind Every SQL Query
Every SQL query you write is secretly a mathematical expression. SQL was built as a practical implementation of a formal system called Relational Algebra. You will never write relational algebra in production — but understanding it explains why SQL behaves the way it does, clarifies complex join behaviour, and prepares you for the exam questions in your CSC 215/INF 215 course.
This module teaches you each operator, shows its mathematical notation, and maps it directly to SQL so the theory and the practice lock together in your mind.
Relations and Sets
In relational algebra, every table is treated as a mathematical set of tuples. Sets have one key property: no duplicates. Every member of a set is unique. This is why primary keys exist — they enforce the set property at the database level.
Relational algebra defines a collection of operations that take one or more relations as input and produce a new relation as output. You can chain these operations together, feeding the output of one into the input of another, building complex retrieval logic step by step.
The Selection Operator — Filtering Rows
The selection operator filters rows based on a condition. It is written using the Greek letter σ (sigma). The condition goes in parentheses after the sigma, and the relation name follows.
The formal notation looks like this:
σ(XP_Score > 1000)(Operators)
Reading it aloud: Select all tuples from the Operators relation where XP_Score is greater than 1000.
The equivalent SQL query is:
SELECT *
FROM Operators
WHERE XP_Score > 1000;
Notice that selection only filters rows. It never removes columns. The output relation has exactly the same columns as the input — just fewer rows.
The Projection Operator — Selecting Columns
The projection operator selects specific columns from a relation and discards the rest. It is written using the Greek letter π (pi). The column names go in parentheses after pi.
π(Callsign, XP_Score)(Operators)
Reading it: Project the Callsign and XP_Score columns from the Operators relation.
The equivalent SQL query is:
SELECT Callsign, XP_Score
FROM Operators;
Projection returns a relation with fewer columns. Because it returns a set, duplicate rows are automatically removed. This is why SELECT DISTINCT in SQL more precisely mirrors the mathematical projection — a plain SELECT can return duplicates, which a true set cannot contain.
Combining Selection and Projection
These two operators are most useful together. To get only the callsigns of operators with more than 1000 XP, you nest the selection inside the projection:
π(Callsign)(σ(XP_Score > 1000)(Operators))
Read from the inside outward: first apply the selection to get high-XP operators, then apply the projection to keep only the Callsign column. The equivalent SQL is:
SELECT Callsign
FROM Operators
WHERE XP_Score > 1000;
The Cartesian Product — Combining Two Relations
The Cartesian product, written as ×, takes every row from one relation and combines it with every row from another. If the Operators relation has 3 rows and the Squads relation has 4 rows, the Cartesian product has 12 rows — every possible combination.
On its own the Cartesian product is rarely useful because most combinations are meaningless. But it is the mathematical foundation that joins are built upon.
The Join — The Most Important Operation
The join operator — written as ⋈ (the bowtie symbol) — combines the Cartesian product with a selection condition. It takes rows from two relations and keeps only the pairs where a specified condition is true.
Operators ⋈(Operators.SquadID = Squads.SquadID) Squads
This reads: Join Operators and Squads, keeping only the pairs where their SquadID values match.
The equivalent SQL is:
SELECT *
FROM Operators
JOIN Squads ON Operators.SquadID = Squads.SquadID;
The optimizer never actually computes the full Cartesian product — it uses indexes to jump directly to matching rows. But mathematically, the join is defined as a product followed by a selection.
Set Operations — Union, Intersection, Difference
Relational algebra includes the standard mathematical set operations. For these to work, both relations must be union-compatible — they must have the same number of columns, and corresponding columns must have compatible data types.
Union (∪) combines all rows from both relations, removing duplicates. The SQL keyword is UNION.
Intersection (∩) returns only rows that appear in both relations. The SQL keyword is INTERSECT.
Difference (−) returns rows that are in the first relation but not in the second. The SQL keyword is EXCEPT in SQL Server.
The Rename Operator
The rename operator, written as ρ (rho), gives a relation or its attributes a new temporary name. This is necessary when you join a table with itself — called a self-join — because you need two distinct names for the same source table.
In SQL, renaming is done with the AS keyword. For example, to find pairs of operators who belong to the same squad:
SELECT
A.Callsign AS Operator1,
B.Callsign AS Operator2
FROM Operators AS A
JOIN Operators AS B ON A.SquadID = B.SquadID;
Here A and B are the renamed copies of the Operators relation — the rename operator in SQL form.
Academic Translator
Relational Algebra ➔ SQL
Projection (Pi) filters columns. Write the SQL equivalent to fetch only the 'handle' and 'rank' from the 'Operators' table.
Knowledge Check
Ready to test your understanding of 4. The Relational Model & Relational Algebra?