Module III - Relational Model & Relational Algebra

1. Introduction to Relational Model

What is the Relational Model?

The Relational Model of data is based on the concept of a Relation, which is a mathematical concept based on the ideas of sets. It was introduced by E.F. Codd in 1970 and provides a formal foundation for relational database management systems.

📊

Informal Definition

A relation is simply a table of values:

A set of rows (tuples)
A set of columns (attributes)
Each row represents a real-world entity or relationship
Each column has a name (attribute name)

📐

Formal Definition

A relation schema R(A1, A2, ..., An) is defined over attributes A1, A2, ..., An, where each attribute has a domain.

CUSTOMER(Cust-id, Cust-name, Address, Phone#)
-- Cust-id domain: 6-digit numbers
-- Cust-name domain: strings of 25 chars
                        

2. Relational Model Concepts

Informal Term	Formal Term	Description
Table	Relation	A two-dimensional structure containing data
Column	Attribute / Domain	A named column of a relation
Row	Tuple	A single row of a relation
Values in a column	Domain	Set of allowed values for an attribute
Table Definition	Schema (Intension)	The logical structure of the relation
Populated Table	Extension (Instance)	The actual data in the relation at a point in time

Example: A Tuple

A tuple is an ordered set of values derived from appropriate domains:

<632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">
-- This tuple belongs to the CUSTOMER relation
                    

Characteristics of Relations

Ordering of Tuples

Tuples are NOT considered to be ordered. The order in which rows appear doesn't matter - they form a set, not a list.

Atomic Values

All values are considered atomic (indivisible). This is a requirement for First Normal Form (1NF).

NULL Values

The special value NULL represents unknown or inapplicable values. It causes complications in many operations.

Attribute Names

Each attribute must have a unique name within a relation. Attributes in R(A1, A2, ..., An) are considered ordered.

3. Relational Model Constraints

Constraints are conditions that must hold on all valid relation instances. There are three main types:

🔑

Key Constraints

Define uniqueness requirements on attributes. No two tuples can have the same value for the key attribute(s).

✅

Entity Integrity

Primary key attributes cannot have NULL values. Every tuple must be uniquely identifiable.

🔗

Referential Integrity

Foreign key values must either match a primary key value in the referenced relation or be NULL.

4. Concept of Keys

Exam Tip

Understanding the hierarchy: Super Key → Candidate Key → Primary Key is essential. Be clear on definitions with examples.

1

Super Key

A set of attributes SK of R such that no two tuples in any valid relation instance r(R) will have the same value for SK. May contain extra attributes.

Example: For INSTRUCTOR relation with attributes (ID, name, dept_name, salary):

{ID}, {ID, name}, {ID, salary}, {ID, name, salary} are all super keys

2

Candidate Key

A minimal super key - removal of any attribute from it results in a set that is NOT a super key. Also called "minimal super key".

Example: For CAR(State, Reg#, SerialNo, Make, Model, Year):

Key1 = {State, Reg#} - Candidate key
Key2 = {SerialNo} - Candidate key
{SerialNo, Make} - Super key but NOT a candidate key (not minimal)

3

Primary Key

One candidate key chosen by the database designer to uniquely identify tuples. Primary key attributes are typically underlined in the schema.

EMPLOYEE(SSN, Name, Address, Salary)

4

Foreign Key

An attribute (or set of attributes) in one relation that references the primary key of another relation. Used to establish relationships between tables.

EMPLOYEE(SSN, Name, DNO)
DEPARTMENT(DNO, DName, Location)
-- DNO in EMPLOYEE references DNO in DEPARTMENT

5

Secondary Key

A candidate key that is not selected as the primary key. Also called alternate keys.

Prime vs Non-Prime Attributes

Prime Attribute

An attribute that is a member of some candidate key.

Non-Prime Attribute

An attribute that is NOT a member of any candidate key.

5. Integrity Constraints

🛡️

Entity Integrity Constraint

The primary key attributes PK of each relation schema R cannot have NULL values in any tuple of r(R).

t[PK] ≠ NULL for any tuple t in r(R)

This is because primary key values are used to identify individual tuples.

🔗

Referential Integrity Constraint

A constraint involving two relations. The foreign key value must either:

Match an existing primary key value in the referenced table, OR
Be NULL (if allowed)

Ensures relationships between tables remain consistent.

Update Operations and Constraints

When constraints are violated during INSERT, DELETE, or UPDATE operations, several actions can be taken:

REJECT: Cancel the operation that causes the violation
CASCADE: Propagate changes to related tuples
SET NULL: Set foreign key values to NULL
SET DEFAULT: Set to a default value

6. Mapping ER Model to Relational Model

Exam Tip

This is a frequently asked 10-mark question: "Draw an ER diagram and map it to relational schema". Know all 8 steps of the mapping algorithm.

ER-to-Relational Mapping Algorithm

Step 1: Mapping of Regular (Strong) Entity Types

For each strong entity E, create a relation R with all simple attributes of E. Choose one key attribute as primary key.

EMPLOYEE(SSN, Fname, Lname, Address, Salary)
DEPARTMENT(DNUMBER, Dname, Location)
PROJECT(PNUMBER, Pname, Plocation)

Step 2: Mapping of Weak Entity Types

For each weak entity W with owner entity E, create a relation R with all attributes of W. Include owner's primary key as foreign key. Primary key = Owner's PK + Partial key of W.

DEPENDENT(ESSN, Dependent_Name, Sex, BDate, Relationship)

Step 3: Mapping of Binary 1:1 Relationship Types

Three approaches:

Foreign Key approach: Add FK to one relation (preferably with total participation)
Merged relation: Combine both entities (when both have total participation)
Cross-reference: Create a third relation

DEPARTMENT(DNUMBER, Dname, Mgr_SSN, Mgr_Start_Date)

Step 4: Mapping of Binary 1:N Relationship Types

Include the primary key of the "one" side as a foreign key in the "many" side relation.

EMPLOYEE(SSN, Fname, Lname, DNO)
-- DNO references DEPARTMENT (1:N - one dept has many employees)

Step 5: Mapping of Binary M:N Relationship Types

Create a new relation S with primary keys of both participating entities as foreign keys. Their combination forms the primary key of S. Include any relationship attributes.

WORKS_ON(ESSN, PNO, Hours)

Step 6: Mapping of Multivalued Attributes

Create a new relation R with the attribute A plus the primary key K of the parent entity as foreign key. Primary key = (K, A).

DEPT_LOCATIONS(DNUMBER, DLOCATION)

Step 7: Mapping of N-ary Relationship Types

Create a new relation S with primary keys of all participating entities as foreign keys, plus any relationship attributes.

SUPPLY(SNAME, PARTNO, PROJNAME, Quantity)

Step 8: Mapping of Specialization/Generalization

Option 8A: Create relations for superclass AND each subclass (subclass has superclass PK as FK)

Option 8B: Create relations ONLY for subclasses with all inherited attributes (works for total specialization)

Option 8C: Single relation with type attribute (for disjoint)

Option 8D: Single relation with multiple Boolean type attributes (for overlapping)

ER Model Construct	Relational Model
Entity type	"Entity" relation
1:1 or 1:N relationship	Foreign key (or "relationship" relation)
M:N relationship	"Relationship" relation with two FKs
N-ary relationship	"Relationship" relation with n FKs
Simple attribute	Attribute
Composite attribute	Set of simple component attributes
Multivalued attribute	Relation and foreign key
Value set	Domain

7. Introduction to Relational Algebra

What is Relational Algebra?

Relational Algebra is a procedural query language that consists of a set of operations that take one or two relations as input and produce a new relation as output. It provides the theoretical foundation for SQL.

Characteristics

Procedural: Specifies what data AND how to retrieve it
Closure property: Output of an operation is also a relation
Foundation for SQL: SQL is based on relational algebra
Operations can be combined into expressions

Types of Operations

Unary: Operate on one relation (SELECT, PROJECT, RENAME)
Binary: Operate on two relations (UNION, INTERSECTION, DIFFERENCE)
Join Operations: Combine related tuples
Aggregate: Compute summary values

Six Fundamental Operations

σ

Select

Horizontal filter (rows)

π

Project

Vertical filter (columns)

∪

Union

Combine tuples from both

−

Set Difference

Tuples in first, not second

×

Cartesian Product

All combinations

ρ

Rename

Rename relation/attributes

8. Unary Relational Operations

σ SELECT Operation (Selection)

Selects tuples (rows) that satisfy a given predicate. It is a horizontal filter that produces a subset of tuples.

Notation

σ_p(r) where p is the selection predicate

Examples

σDNO=4(EMPLOYEE)
-- Select all employees where department number is 4

σSALARY > 30000(EMPLOYEE)
-- Select employees with salary greater than 30,000

σdept_name='IT' ∧ marks > 60(student)
-- Select IT students with marks > 60
                        

Properties:

Result has the same schema (attributes) as the input relation
SELECT is commutative: σ_c1(σ_c2(R)) = σ_c2(σ_c1(R))
Cascaded SELECTs can be combined: σ_c1(σ_c2(R)) = σ_{c1 ∧ c2}(R)

π PROJECT Operation (Projection)

Returns the relation with only specified attributes (columns). It is a vertical filter.

Notation

π_{A1, A2, ..., Ak}(r) where A1, A2, ..., Ak are attribute names

Examples

πLNAME, FNAME, SALARY(EMPLOYEE)
-- Get only name and salary columns

πID, name, salary(instructor)
-- Project ID, name, salary from instructor relation
                        

Properties:

Duplicate rows are removed (since relations are sets)
Number of tuples in result ≤ number in input
If projection includes a key, tuple count remains same

ρ RENAME Operation

Allows renaming of relations and/or their attributes. Critical for self-joins and clarifying results.

Notation

ρ_{S(B1, B2, ..., Bn)}(R) - Rename R to S with new attribute names

ρ_S(R) - Just rename relation to S

Example: Self-Join

-- Find employees who earn more than their supervisor
DEP5_EMPS ← σDNO=5(EMPLOYEE)
RESULT ← πFNAME, LNAME, SALARY(DEP5_EMPS)

-- Using rename for self-join
ρd(instructor) -- Rename instructor as 'd' for comparison
                        

SELECT and PROJECT Visualization

9. Set Theory Operations

Type Compatibility (Union Compatibility)

For UNION, INTERSECTION, and DIFFERENCE operations, the two relations must be type compatible:

Same number of attributes
Corresponding attributes have compatible domains

∪ UNION Operation

Returns all tuples that are in either R or S or both. Duplicate tuples are eliminated.

r ∪ s = {t | t ∈ r OR t ∈ s}

-- Example: Find SSNs of employees who work in dept 5 OR supervise someone in dept 5
DEP5_EMPS ← σDNO=5(EMPLOYEE)
RESULT1 ← πSSN(DEP5_EMPS)
RESULT2(SSN) ← πSUPERSSN(DEP5_EMPS)
RESULT ← RESULT1 ∪ RESULT2
                    

∩ INTERSECTION Operation

Returns all tuples that are in BOTH R and S.

r ∩ s = {t | t ∈ r AND t ∈ s}

-- Example: Find people who are both students AND instructors
πName(STUDENT) ∩ πName(INSTRUCTOR)
                    

INTERSECTION can be expressed using other operations: R ∩ S = R − (R − S)

− SET DIFFERENCE (MINUS) Operation

Returns all tuples that are in R but NOT in S.

r − s = {t | t ∈ r AND t ∉ s}

-- Example: Find courses taught in Fall 2009 but NOT in Spring 2010
πcourse_id(σsemester='Fall' ∧ year=2009(section)) −
πcourse_id(σsemester='Spring' ∧ year=2010(section))
                    

Note: SET DIFFERENCE is NOT commutative: R − S ≠ S − R

× CARTESIAN PRODUCT Operation

Combines every tuple of R with every tuple of S. If R has n_r tuples and S has n_s tuples, result has n_r × n_s tuples.

r × s = {t q | t ∈ r AND q ∈ s}

-- Example
FEMALE_EMPS ← σSEX='F'(EMPLOYEE)
EMPNAMES ← πFNAME, LNAME, SSN(FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
                    

Cartesian product alone is rarely useful. It's typically combined with SELECT to create meaningful joins.

Set Operations Visualization

10. Binary Relational Operations

÷ DIVISION Operation

Used when we need to find tuples in one relation that are related to ALL tuples in another relation. Useful for "for all" type queries.

Definition

R(Z) ÷ S(X) where X ⊆ Z. Let Y = Z − X. The result contains tuples t such that for every tuple s in S, there is a tuple (t, s) in R.

Example

Find all customers who have accounts at ALL branches located in Brooklyn:

πcustomer_name, branch_name(depositor ⋈ account) ÷
πbranch_name(σbranch_city='Brooklyn'(branch))
                        

11. Join Operations

⋈ NATURAL JOIN

Combines tuples from two relations based on common attributes. Automatically matches attributes with the same name and removes duplicate columns.

-- Natural join of instructor and teaches on common attribute ID
instructor ⋈ teaches

-- Result has attributes: ID, name, dept_name, salary, course_id, sec_id, semester, year
                    

Danger of Natural Join

Beware of unrelated attributes with the same name getting matched incorrectly!

⋈_θ THETA JOIN (Conditional Join)

Combines Cartesian product with a selection condition θ. More general than natural join.

R ⋈condition S = σcondition(R × S)

-- Example: Join DEPARTMENT and EMPLOYEE where manager SSN matches
DEPARTMENT ⋈MGRSSN=SSN EMPLOYEE
                    

=⋈ EQUIJOIN

A theta join where the condition uses only equality (=) comparisons. Most common type of join.

-- Find manager name for each department
DEPT_MGR ← DEPARTMENT ⋈MGRSSN=SSN EMPLOYEE
                    

Outer Joins

Outer joins preserve tuples that don't have matching tuples in the other relation by padding with NULLs.

⟕ LEFT OUTER JOIN

Keeps all tuples from the LEFT relation. Unmatched tuples get NULL values for right relation attributes.

course ⟕ prereq

⟖ RIGHT OUTER JOIN

Keeps all tuples from the RIGHT relation. Unmatched tuples get NULL values for left relation attributes.

course ⟖ prereq

⟗ FULL OUTER JOIN

Keeps all tuples from BOTH relations. Unmatched tuples from either side get NULLs.

course ⟗ prereq

12. Aggregate Functions

Aggregate Functions

Functions that compute summary values from collections of data. Cannot be expressed in basic relational algebra.

Common Functions

COUNT: Number of values/tuples
SUM: Sum of numeric values
AVG: Average of numeric values
MAX: Maximum value
MIN: Minimum value

Notation

grouping_attrsℱfunction_list(R)

-- Examples:
ℱMAX Salary(Employee)
-- Maximum salary

DNOℱCOUNT SSN, AVG Salary(Employee)
-- Count and average per department
                        

13. Relational Algebra Query Examples

Q1: Retrieve name and address of all employees who work for the 'Research' department

▼

RESEARCH_DEPT ← σDNAME='Research'(DEPARTMENT)
RESEARCH_EMPS ← (RESEARCH_DEPT ⋈DNUMBER=DNO EMPLOYEE)
RESULT ← πFNAME, LNAME, ADDRESS(RESEARCH_EMPS)
                            

Q2: Retrieve the names of employees who have no dependents

▼

ALL_EMPS ← πSSN(EMPLOYEE)
EMPS_WITH_DEPS(SSN) ← πESSN(DEPENDENT)
EMPS_WITHOUT_DEPS ← ALL_EMPS − EMPS_WITH_DEPS
RESULT ← πLNAME, FNAME(EMPS_WITHOUT_DEPS ⋈ EMPLOYEE)
                            

Q3: Find the largest salary in the university

▼

-- Step 1: Find salaries that are NOT maximum (less than some other)
NOT_MAX ← πinstructor.salary(σinstructor.salary < d.salary(instructor × ρd(instructor)))

-- Step 2: Maximum = All salaries - Non-maximum salaries
MAX_SALARY ← πsalary(instructor) − NOT_MAX
                            

Q4: Find courses offered in Fall 2009 AND Spring 2010

▼

FALL_2009 ← πcourse_id(σsemester='Fall' ∧ year=2009(section))
SPRING_2010 ← πcourse_id(σsemester='Spring' ∧ year=2010(section))
RESULT ← FALL_2009 ∩ SPRING_2010
                            

Q5: Find names of instructors in Physics along with course IDs they taught

▼

-- Method 1:
πinstructor.ID, course_id(σdept_name='Physics'(σinstructor.ID=teaches.ID(instructor × teaches)))

-- Method 2 (more efficient):
πinstructor.ID, course_id(σinstructor.ID=teaches.ID(σdept_name='Physics'(instructor) × teaches))
                            

14. Practice Questions

Explain the following Relational Algebra operators with suitable examples: Select, Project, Union, Rename, Natural Join [10 marks]

▼

1. SELECT (σ) - Selection Operation:

Unary operator that selects tuples satisfying a predicate
Horizontal filtering of rows
Notation: σ_p(r) where p is the predicate

Example: σ_{salary > 50000}(EMPLOYEE) - selects employees with salary > 50000

2. PROJECT (π) - Projection Operation:

Unary operator that returns specified attributes
Vertical filtering of columns
Removes duplicate tuples

Example: π_{name, salary}(EMPLOYEE) - projects only name and salary

3. UNION (∪):

Binary operator combining tuples from both relations
Relations must be union-compatible
Eliminates duplicates

Example: π_name(STUDENT) ∪ π_name(INSTRUCTOR)

4. RENAME (ρ):

Renames relation and/or attributes
Used for self-joins and clarifying expressions

Example: ρ_{S(ID, Name)}(EMPLOYEE) - renames to S with new attribute names

5. NATURAL JOIN (⋈):

Combines tuples from two relations on common attributes
Automatically matches same-named attributes
Removes duplicate attributes

Example: EMPLOYEE ⋈ DEPARTMENT - joins on common DNO

Define: Super Key, Candidate Key, Primary Key, Foreign Key with examples [5 marks]

▼

Super Key: A set of attributes that uniquely identifies each tuple. May contain extra attributes.

Example: {SSN}, {SSN, Name}, {SSN, Salary} are all super keys of EMPLOYEE

Candidate Key: A minimal super key - removing any attribute makes it non-unique.

Example: For CAR relation - {State, Reg#} and {SerialNo} are candidate keys

Primary Key: The candidate key chosen by designer to identify tuples. Underlined in schema.

Example: EMPLOYEE(SSN, Name, Address) - SSN is primary key

Foreign Key: Attribute(s) in one relation that reference the primary key of another relation.

Example: DNO in EMPLOYEE references DNUMBER in DEPARTMENT

Explain ER to Relational Mapping algorithm with all steps [10 marks]

▼

Step 1 - Strong Entity: Create relation with all simple attributes. Entity's key becomes PK.

Step 2 - Weak Entity: Create relation with all attributes + owner's PK as FK. PK = Owner's PK + Partial key.

Step 3 - Binary 1:1: Add FK to one relation (preferably total participation side).

Step 4 - Binary 1:N: Add PK of "1" side as FK in "N" side relation.

Step 5 - Binary M:N: Create new relation with both PKs as FKs, forming composite PK.

Step 6 - Multivalued Attribute: Create new relation with attribute + owner's PK.

Step 7 - N-ary Relationship: Create relation with all participating PKs as FKs.

Step 8 - Specialization/Generalization: Options include separate tables for super/subclasses or combined tables with type attributes.

Explain different types of Join operations in Relational Algebra [5 marks]

▼

1. Natural Join (⋈): Joins on all common attributes, removes duplicates.

2. Theta Join (⋈_θ): Cartesian product followed by selection on condition θ.

3. Equijoin: Theta join where condition uses only equality (=).

4. Left Outer Join (⟕): Keeps all left tuples, NULLs for unmatched right.

5. Right Outer Join (⟖): Keeps all right tuples, NULLs for unmatched left.

6. Full Outer Join (⟗): Keeps all tuples from both, NULLs where no match.

Relational Model & Relational Algebra

100% Exam Frequency

1. Introduction to Relational Model

What is the Relational Model?

Informal Definition

Formal Definition

2. Relational Model Concepts

Example: A Tuple

Characteristics of Relations

Ordering of Tuples

Atomic Values

NULL Values

Attribute Names

3. Relational Model Constraints

Key Constraints

Entity Integrity

Referential Integrity

4. Concept of Keys

Exam Tip

Super Key

Candidate Key

Primary Key

Foreign Key

Secondary Key

Prime vs Non-Prime Attributes

Prime Attribute

Non-Prime Attribute

5. Integrity Constraints

Entity Integrity Constraint

Referential Integrity Constraint

Update Operations and Constraints

6. Mapping ER Model to Relational Model

Exam Tip

ER-to-Relational Mapping Algorithm

Step 1: Mapping of Regular (Strong) Entity Types

Step 2: Mapping of Weak Entity Types

Step 3: Mapping of Binary 1:1 Relationship Types

Step 4: Mapping of Binary 1:N Relationship Types

Step 5: Mapping of Binary M:N Relationship Types

Step 6: Mapping of Multivalued Attributes

Step 7: Mapping of N-ary Relationship Types

Step 8: Mapping of Specialization/Generalization

7. Introduction to Relational Algebra

What is Relational Algebra?

Characteristics

Types of Operations

Six Fundamental Operations

Select

Project

Union

Set Difference

Cartesian Product

Rename

8. Unary Relational Operations

σ SELECT Operation (Selection)

Notation

Examples

Properties:

π PROJECT Operation (Projection)

Notation

Examples

Properties:

ρ RENAME Operation

Notation

Example: Self-Join

SELECT and PROJECT Visualization

9. Set Theory Operations

Type Compatibility (Union Compatibility)

∪ UNION Operation

∩ INTERSECTION Operation

− SET DIFFERENCE (MINUS) Operation

× CARTESIAN PRODUCT Operation

Set Operations Visualization

10. Binary Relational Operations

÷ DIVISION Operation

Definition

Example

11. Join Operations

⋈ NATURAL JOIN

Danger of Natural Join

⋈_θ THETA JOIN (Conditional Join)