Normalization Process: 4NF, 5NF - CSU357 - Shoolini U

Normal Forms - 4NF, 5NF

1. Fourth Normal Form (4NF)

The Fourth Normal Form (4NF) is a level of database normalization designed to reduce data redundancy in relational databases. It builds upon the Third Normal Form (3NF) by addressing multi-valued dependencies, and it is essential for ensuring data integrity.

A relation is in 4NF if it is in 3NF and does not contain any multi-valued dependencies unless they are trivial. A multi-valued dependency occurs when, for a given combination of columns in a table, multiple rows list combinations of values that are independent of each other.

1.1 Understanding Multi-valued Dependencies

Multi-valued dependency exists when, for two attributes in a relation, one attribute determines multiple values of another. The notation A →→ B denotes that B is multi-valued dependent on A.

Example: Consider a table with columns StudentID, Course, and Hobby. If a student is enrolled in multiple courses and also has multiple hobbies, the courses are independent of hobbies, creating a multi-valued dependency between StudentID and both Course and Hobby.

1.2 Achieving 4NF

To transform a relation into 4NF, you must:

  • Ensure the relation is in 3NF.
  • Identify and remove multi-valued dependencies by decomposing the original relation into smaller, well-structured relations that preserve dependencies without redundancy.

Decomposition Example:

CREATE TABLE Students (
    StudentID INT,
    Course VARCHAR(100),
    PRIMARY KEY (StudentID, Course)
);

CREATE TABLE Hobbies (
    StudentID INT,
    Hobby VARCHAR(100),
    PRIMARY KEY (StudentID, Hobby)
);

This decomposition eliminates the multi-valued dependency by creating two separate tables, one for courses and one for hobbies, linked by StudentID.

2. Fifth Normal Form (5NF)

Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJNF), is the pinnacle of database normalization designed to reduce redundancy in relational databases by decomposing tables in a way that they can be reconstructed by joins.

A relation is in 5NF if, and only if, every join dependency in it is implied by the candidate keys. This means that the data cannot be further decomposed without losing information, ensuring that the database is free of redundancy beyond what is achievable through 4NF.

2.1 Understanding Join Dependencies

A join dependency exists when a table can be recreated by joining multiple tables that have been decomposed from it. These decomposed tables are projections of the original table.

For a relation R with subsets R1, R2, ..., Rn, a join dependency is denoted as *(R1, R2, ..., Rn) and indicates that relation R can be reconstructed by joining R1, R2, ..., Rn.

2.2 Achieving 5NF

To achieve 5NF, perform the following steps:

  • Ensure the relation is in 4NF.
  • Analyze the relation for join dependencies that are not implied by candidate keys.
  • Decompose the relation into smaller relations that can be joined to produce the original relation without loss of information.

Decomposition Example:

CREATE TABLE CourseInstructors (
    CourseID INT,
    InstructorID INT,
    PRIMARY KEY (CourseID, InstructorID)
);

CREATE TABLE CourseRooms (
    CourseID INT,
    RoomID INT,
    PRIMARY KEY (CourseID, RoomID)
);

CREATE TABLE InstructorRooms (
    InstructorID INT,
    RoomID INT,
    PRIMARY KEY (InstructorID, RoomID)
);

This example demonstrates decomposing a relation into smaller relations that preserve join dependencies without redundancy, achieving 5NF.