IST 210
Organization of Data
Normal Form Definitions & Examples

From Codd & Date


1NF A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only

Example: 1NF but not 2NF

FIRST (supplier_no, status, city, part_no, quantity)

Functional Dependencies:

(supplier_no, part_no) quantity

(supplier_no) status

(supplier_no) city

city status (Supplier's status is determined by location)

Comments:

Non-key attributes are not mutually independent (city status).

Non-key attributes are not fully functionally dependent on the primary key (i.e., status and city are dependent on just part of the key, namely supplier_no).

Anomalies:

INSERT: We cannot enter the fact that a given supplier is located in a given city until that supplier supplies at least one part (otherwise, we would have to enter a null value for a column participating in the primary key C a violation of the definition of a relation).

DELETE: If we delete the last (only) row for a given supplier, we lose the information that the supplier is located in a particular city.

UPDATE: The city value appears many times for the same supplier. This can lead to inconsistency or the need to change many values of city if a supplier moves.

Decomposition (into 2NF):

SECOND (supplier_no, status, city)

SUPPLIER_PART (supplier_no, part_no, quantity)


2NF A relation R is in second normal form (2NF) if and only if it is in 1NF and every non-key attribute is fully dependent on the primary key

Example (2NF but not 3NF):

SECOND (supplier_no, status, city)

Functional Dependencies:

supplier_no status

supplier_no city

city status

Comments:

Lacks mutual independence among non-key attributes.

Mutual dependence is reflected in the transitive dependencies: supplier_no city, city status.

Anomalies:

INSERT: We cannot record that a particular city has a particular status until we have a supplier in that city.

DELETE: If we delete a supplier which happens to be the last row for a given city value, we lose the fact that the city has the given status.

UPDATE: The status for a given city occurs many times, therefore leading to multiple updates and possible loss of consistency.

Decomposition (into 3NF):

SUPPLIER_CITY (supplier_no, city)

CITY_STATUS (city, status)


3NF A relation R is in third normal form (3NF) if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key. An attribute C is transitively dependent on attribute A if there exists an attribute B such that: A B and B C. Note that 3NF is concerned with transitive dependencies which do not involve candidate keys. A 3NF relation with more than one candidate key will clearly have transitive dependencies of the form: primary_key other_candidate_key any_non-key_column

An alternative (and equivalent) definition for relations with just one candidate key is:

A relation R having just one candidate key is in third normal form (3NF) if and only if the non-key attributes of R (if any) are: 1) mutually independent, and 2) fully dependent on the primary key of R. A non-key attribute is any column which is not part of the primary key. Two or more attributes are mutually independent if none of the attributes is functionally dependent on any of the others. Attribute Y is fully functionally dependent on attribute X if X Y, but Y is not functionally dependent on any proper subset of the (possibly composite) attribute X

For relations with just one candidate key, this is equivalent to the simpler:

A relation R having just one candidate key is in third normal form (3NF) if and only if no non-key column (or group of columns) determines another non-key column (or group of columns)

Example (3NF but not BCNF):

SUPPLIER_PART (supplier_no, supplier_name, part_no, quantity)

Functional Dependencies:

We assume that supplier_name's are always unique to each supplier. Thus we have two candidate keys:

(supplier_no, part_no) and (supplier_name, part_no)

Thus we have the following dependencies:

(supplier_no, part_no) quantity

(supplier_no, part_no) supplier_name

(supplier_name, part_no) quantity

(supplier_name, part_no) supplier_no

supplier_name supplier_no

supplier_no supplier_name

Comments:

Although supplier_name supplier_no (and vice versa), supplier_no is not a non-key column — it is part of the primary key! Hence this relation technically satisfies the definition(s) of 3NF (and likewise 2NF, again because supplier_no is not a non-key column).

Anomalies:

INSERT: We cannot record the name of a supplier until that supplier supplies at least one part.

DELETE: If a supplier temporarily stops supplying and we delete the last row for that supplier, we lose the supplier's name.

UPDATE: If a supplier changes name, that change will have to be made to multiple rows (wasting resources and risking loss of consistency).

Decomposition (into BCNF):

SUPPLIER_ID (supplier_no, supplier_name)

SUPPLIER_PARTS (supplier_no, part_no, quantity)


BCNF A relation R is in Boyce-Codd normal form (BCNF) if and only if every determinant is a candidate key

The definition of BCNF addresses certain (rather unlikely) situations which 3NF does not handle. The characteristics of a relation which distinguish 3NF from BCNF are given below. Since it is so unlikely that a relation would have these characteristics, in practical real-life design it is usually the case that relations in 3NF are also in BCNF. Thus many authors make a "fuzzy" distinction between 3NF and BCNF when it comes to giving advice on "how far" to normalize a design. Since relations in 3NF but not in BCNF are slightly unusual, it is a bit more difficult to come up with meaningful examples. To be precise, the definition of 3NF does not deal with a relation that:

  1. has multiple candidate keys, where

  2. those candidate keys are composite, and

  3. the candidate keys overlap (i.e., have at least one common attribute)

Example:

An example of a relation in 3NF but not in BCNF (and exhibiting the three properties listed) was given above in the discussion of 3NF. The following relation is in BCNF (and also in 3NF):

SUPPLIERS (supplier_no, supplier_name, city, zip)

We assume that each supplier has a unique supplier_name, so that supplier_no and supplier_name are both candidate keys.

Functional Dependencies:

supplier_no city

supplier_no zip

supplier_no supplier_name

supplier_name city

supplier_name zip

supplier_name supplier_no

Comments:

The relation is in BCNF since both determinants (supplier_no and supplier_name) are unique (i.e., are candidate keys).

The relation is also in 3NF since even though the non-primary-key column supplier_name determines the non-key columns city and zip, supplier_name is a candidate key. Transitive dependencies involving a second (or third, fourth, etc.) candidate key in addition to the primary key do not violate 3NF.

Note that even relations in BCNF can have anomalies.

Anomalies:

INSERT: We cannot record the city for a supplier_no without also knowing the supplier_name

DELETE: If we delete the row for a given supplier_name, we lose the information that the supplier_no is associated with a given city.

UPDATE: Since supplier_name is a candidate key (unique), there are none.

Decomposition:

SUPPLIER_INFO (supplier_no, city, zip)

SUPPLIER_NAME (supplier_no, supplier_name)

Larry Newcomer (Updated January 06, 2000 )