Normalization is a structural method whereby tables are broken down in a controlled manner with an aim of reducing data redundancy while at the same time enhancing the quality of stored data. It refers to the process of arranging the attributes and relations of a database in order to minimize data anomalies such as update, insert and delete anomalies. Normalization is usually a sequence of steps which are also called normal forms (NF).
The First Normal Form (1NF) and Second Normal Form (2NF) are very important towards the achievement of a normalized database. Where 1NF is centred on the removal of repeating groups, 2NF is associated with the issue of partial dependencies. With the aid of these normal forms, the databases yield high efficiency and can withstand peculiarities that are viewed as anomalies, thereby helping in proper organization of databases.
Second Normal Form
The second Normal Form (2NF) is based on the concept of fully functional dependency. The second Normal Form applies to relations with composite keys, that is, relations with a primary key composed of two or more attributes. A relation with a single-attribute primary key is automatically in at least 2NF. A relation that is not in 2NF may suffer from the update anomalies. To be in the second normal form, a relation must be in the first normal form and the relation must not contain any partial dependency. A relation is in 2NF if it has no partial dependency, i.e., no non-prime attribute (attributes that are not part of any candidate key), which is dependent on any proper subset of any candidate key in the table. In other words,
A relation that is in First Normal Form and every non-primary-key attribute is fully functionally dependent on the primary key, then the relation is in Second Normal Form (2NF).
Note – If the proper subset of the candidate key determines a non-prime attribute, it is called partial dependency . The normalization of 1NF relations to 2NF involves the removal of partial dependencies . If a partial dependency exists, we remove the partially dependent attribute(s) from the relation by placing them in a new relation along with a copy of their determinant. Consider the examples given below.
Example-1: Consider the table below.
STUD_NO COURSE_NO COURSE_FEE
1 C1 1000
2 C2 1500
1 C4 2000
4 C3 1000
4 C1 1000
2 C5 2000
{Note that, there are many courses having the same course fee. } Here, COURSE_FEE cannot alone decide the value of COURSE_NO or STUD_NO; COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO; COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO; Hence, COURSE_FEE would be a non-prime attribute, as it does not belong to the one only candidate key {STUD_NO, COURSE_NO} ; But, COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO, which is a proper subset of the candidate key. Non-prime attribute COURSE_FEE is dependent on a proper subset of the candidate key, which is a partial dependency and so this relation is not in 2NF. To convert the above relation to 2NF, we need to split the table into two tables such as : Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE
Table 1 Table 2
STUD_NO COURSE_NO COURSE_NO COURSE_FEE
1 C1 C1 1000
2 C2 C2 1500
1 C4 C3 1000
4 C3 C4 2000
4 C1 C5 2000
2 C5
Note – 2NF tries to reduce the redundant data getting stored in memory. For instance, if there are 100 students taking C1 course, we dont need to store its Fee as 1000 for all the 100 records, instead once we can store it in the second table as the course fee for C1 is 1000.
Example-2: Consider following functional dependencies in relation R (A, B, C, D )
AB -> C [A and B together determine C]
BC -> D [B and C together determine D]
Answer:
First, we can check if there are any partial dependencies. A partial dependency occurs when a non-prime attribute (not part of any candidate key) depends on only part of a candidate key.
The candidate keys for relation R can be determined by finding the closure of each attribute:
AB determines every keys.
Now, let’s check for partial dependencies:
There are no partial dependencies in this relation because each non-prime attribute (C and D) depends on the whole candidate key(s) it is part of (AB and BC, respectively).
Therefore, the relation R is already in 3rd Normal Form (3NF) because it satisfies the conditions of 1st Normal Form (1NF) and 2nd Normal Form (2NF) and does not have any transitive dependencies.
What is Partial Dependency?
A case where the non prime attribute is totally dependent on only some of the components of a composite primary key . In a relation where the primary key is composed of more than one attributes if one or more non-prime attributes depend only on some of the attributes of the composite key they are said to satisfy partial dependency. This leads to redundancy and anomalies This refers to the duplication of effort and the existence of complex conditions which are conflicting in nature or of, quite a contradictory nature. For the eradication of the problem of partial dependency, relations need to be normalized to the second normal form which guarantees that all attributes which are not included in a key are fully determined by the whole key.
Conclusion
In conclusion, 2NF is a fundamental concept of database normalization that helps remove partial dependencies in your relational database. Following 2NF rules helps organize your database to avoid anomalies and ensure data integrity, making it easier to store and retrieve data.
Frequently Asked Questions on Second Normal Form (2NF) – FAQs
What is the purpose of normalization?
Database normalization is the act of structuring the data, in an organized manner to avoid repetition and data inconsistency. It helps in organizing the database in such a manner that it simplifies the storage process of information and at the same time minimizing on the occurrence of anomalies and enhancing the overall efficiency of data retrieval.
What is the First Normal Form (1NF)?
1NF removes repeating groups as well as other unnecessary repeating groups in a table through the practice of making all values atomic. That is, each cell in the table should contain scalar value, and there is no set or array value within the cell.
What is the Second Normal Form (2NF)?
This is as a result of the fact that 2NF caters for partial dependency in relations that have composite primary attributes. A relation is in 2NF if it is in 1NF and has no partial dependencies and all non prime attributes depend on the whole composite keys and not on the part of them.
What is a composite primary key?
A composite primary key is the primary key that contains two and more attributes A composite primary key is a primary key made up of two or more attributes. Coupled, it provides identification of a record within a table rather than identification by a single attribute as in the case with the traditional relational databases.
What is a candidate key?
A candidate key is can be defined as the attribute, or a combination of attributes, which can locate a record uniquely within relation. A relation can contain one or many candidate keys, nevertheless, only one, becomes the primary key.
What are functional dependencies?
Functional dependencies helps in explaining the structural interconnectivity betweens attributes in a relation. In plain language functional dependencies are basically thought of this way: if attribute A functionally determines attribute B, then means the value of attribute A defines the value of attribute B alone and vice versa.
How does 2NF help in improving database efficiency?
2NF minimise redundancy in a manner that makes non prime attributes completely dependent on the primary key. This is useful to avoid replicating data on the database, which results to optimizing memory used in inserting, updating or deleting records.
What are transitive dependencies?
Transitive dependency takes place whenever one or more than one non-Prime attributes are dependent on some other attribute which is not a candidate key. This occurs when a non-prime attribute is related to other non-prime attribute by means of the primary key. All forms of transitive dependencies, therefore, are removed in the third normal form databases.