Tags:conceptdatabasenormalizationfunctionaldependencies Status:🟩
Normalization
Summary
Normalization organizes data in a relational database to reduce redundancy and improve data integrity. It minimizes anomalies during data operations and enhances efficiency. The process involves evaluating relations against four normal forms and decomposing those that do not meet the criteria.
Details
Normalization is the process of organizing data in a relational database to reduce redundancy and improve data integrity. By structuring data into related tables, it minimizes anomalies during data operations and enhances the efficiency and maintainability of the database.
Goals of normalization The primary goal of normalization is to ensure that no anomalies occur during data insertion, deletion, or update. This is achieved by analyzing the given relations based on their functional dependencies and candidate keys. As a result, the risk of inconsistent updates is reduced, and storage space is utilized efficiently.
Levels of understanding At the logical level, normalization helps users easily understand the meaning of data and formulate correct queries. At the implementation level, it ensures that the database operates effectively without redundancy.
Normal forms and decomposition There are four main normal forms in normalization, with higher normal forms having stricter constraints. The normalization procedure involves evaluating relations against these normal form tests sequentially. Relations that do not meet the requirements are decomposed into smaller, more manageable relations.
Informal Normalization Guidelines
These are some guidelines to go by when normalizing tables.
- The name of the relation should be meaningful.
- Attribute types from multiple entity types should not be combined in a single relation.
- Avoid excessive amounts of NULL values in a relation. Maybe most records don’t have a value for a specific attribute type, hence NULL will be used. Instead split it up into relationship table, so the records that holds a value for the specific attribute type can get associated.

1NF
Rules:
- All attributes are of atomic types. No lists or sets as attribute type.
There could be 2 ways to solve this:
- Separate rows: Create a new record with the same user and different email. However this would create 2 records with the same id which might be a problem if its a unique key.
- New table: Decompose the email into a new table that contains
idandmail. Then its possible to associate the id with name, age etc. in one table and the id with email in another table.
2NF
Rules:
- Is on 1NF
- Non-key attributes fully (not partial) depend on the whole candidate key(s).
![]() | ![]() |
|---|---|
The keys are manufacturer and model. Manufacturercountry. | |
We have a non key attribute country. It depends on the manufacturer but that is just a part of the key, so it depends only partially on the key. |
3NF
Rules:
- Is on 2NF
- No non-prime attribute type of R is transitively dependent on the primary key.
![]() | ![]() |
|---|---|
| A position determines the salary. Hence its redundant to write the salary in the same table as the employees. If the salary changes for a position it has to be changed in every attribute who has that position. |
BCNF
Rules:
- Is on 3NF
- Every determinant must be a candidate key
| Old | New | New |
|---|---|---|
![]() | ![]() | ![]() |
course_id→room(A course determines the room it’s taught in).instructor→room(An instructor determines the room they teach in). In this case, even though bothcourse_idandinstructordetermineroom, neither is a candidate key—since the combination ofcourse_idandinstructoruniquely identifies rows. This violates BCNF because we have a determinant (instructor) that isn’t a candidate key.
To fix this we need to break the table into two tables that eliminate the problem.
Now, each determinant (both course_id and instructor) is a candidate key in its respective table, ensuring the tables are in BCNF.
Guide to normalize
- Identify ‘bad’ functional dependencies that break BCNF/3NF.
- If there are such, then decompose tables into two tables.
- Check again and repeat for sub-tables.
- We don’t decompose from 3NF to BCNF.
Practical decomposition
-
Consider relation R and (important) functional dependency that violates 3NF/BCNF.
- Decompose into and where
- (everything but Y = the right side)
- (the whole FD = both left and right side)
-
This has the following properties
- is (normally) in BCNF
- Joining and (with = on all X attributes) yields R
-
Example: Person(ID, Name, ZIP, City) and ZIP City
- X = ZIP, Y = City
- = R-Y = Person(ID, Name, ZIP)
- = XY = ZIP(ZIP, City)
Normalization dependency diagram
This diagram can be used to see which normal form the table is on.
There are attributes ABCDE. A and B are the keys.
The lowest NF that applies is the one that counts.
Examples







