In relational databases, normalization is the restructuring of database fields and tables. It eliminates redundancy, organizes data efficiently, reduces the potential for anomalies during data operations and improves data consistency. The formal classifications used for quantifying "how normalized" a relational database is are called normal forms (abbrev. NF).
A non-normalized database can suffer from data anomalies:
A non-normalized database may store data representing a particular referent in multiple locations. An update to such data in some but not all of those locations results in an update anomaly, yielding inconsistent data. A normalized database prevents such an anomaly by storing such data (i.e. data other than primary keys) in only one location.
A non-normalized database may have inappropriate dependencies, i.e. relationships between data with no functional dependencies. Adding data to such a database may require first adding the unrelated dependency. A normalized database prevents such insertion anomalies by ensuring that database relations mirror functional dependencies.
Similarly, such dependencies in non-normalized databases can hinder deletion. That is, deleting data from such databases may require deleting data from the inappropriate dependency. A normalized database prevents such deletion anomalies by ensuring that all records are uniquely identifiable and contain no extraneous information.
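The update anomaly described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names (orders_flat, customers, orders, city) are made up for the example and do not come from any real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Non-normalized (hypothetical schema): the customer's city is repeated on every order row.
conn.execute(
    "CREATE TABLE orders_flat (order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [(1, "alice", "Paris"), (2, "alice", "Paris")],
)
# Updating only one of the two copies leaves the data inconsistent.
conn.execute("UPDATE orders_flat SET city = 'Lyon' WHERE order_id = 1")
flat_cities = {
    row[0]
    for row in conn.execute("SELECT city FROM orders_flat WHERE customer = 'alice'")
}

# Normalized (hypothetical schema): the city is stored once, keyed by customer.
conn.execute("CREATE TABLE customers (customer TEXT PRIMARY KEY, city TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
    "customer TEXT REFERENCES customers(customer))"
)
conn.execute("INSERT INTO customers VALUES ('alice', 'Paris')")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "alice"), (2, "alice")])
# A single update is seen by every order that joins to the customer.
conn.execute("UPDATE customers SET city = 'Lyon' WHERE customer = 'alice'")
normalized_cities = {
    row[0]
    for row in conn.execute(
        "SELECT c.city FROM orders o JOIN customers c ON c.customer = o.customer"
    )
}
```

In the flat table the same customer ends up with two different cities, while the normalized pair of tables can only ever report one.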
Normalized databases have a design that reflects the true dependencies between tracked quantities, allowing quick updates to data with little risk of introducing inconsistencies. Instead of lumping all information into one table, data is spread out logically into many tables. Normalizing the data means decomposing a single relation into a set of smaller relations that satisfy the constraints of the original relation. Decomposing the tables removes redundancy, but decomposition has costs of its own, such as the extra joins needed to reassemble the data. Normalization therefore helps the designer make a conscious decision about how much redundancy to remove, keeping the pros and cons in mind.
One can only describe a database as having a normal form if the relationships between quantities have been rigorously defined. It is possible to use set theory to express this knowledge once a problem domain has been fully understood, but most database designers model the relationships in terms of an "idealized schema". (The mathematical support comes back into play in proofs regarding the process of transforming from one form to another.)
=============================================
Normal forms
Edgar F. Codd originally defined the first three normal forms. The first normal form requires that tables be made up of a primary key and a number of atomic fields, and the second and third deal with the relationship of non-key fields to the primary key. These have been summarised as requiring that all non-key fields be dependent on "the key, the whole key and nothing but the key". In practice, most applications in 3NF are fully normalized. However, research has identified potential update anomalies in 3NF databases. BCNF is a further refinement of 3NF that attempts to eliminate such anomalies.
The fourth and fifth normal forms (4NF and 5NF) deal specifically with the representation of many-to-many and one-to-many relationships. Sixth normal form (6NF) only applies to temporal databases.
==>First normal form
First normal form (1NF) lays the groundwork for an organised database design:
Ensure that each table has a primary key: an attribute or combination of attributes whose values are guaranteed to be different for every record of the table.
Eliminate repeating groups (categories of data that would need to appear a different number of times on different records) by defining key and non-key attributes appropriately.
Atomicity: Each attribute must contain a single value, not a set of values.
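As a rough sketch of the 1NF rules above, the Python below takes records whose "phones" attribute holds a list of values and splits them into two tables with atomic values. The record layout and the function name to_first_normal_form are invented for this example.

```python
# Hypothetical non-1NF records: "phones" holds several values in one attribute.
people = [
    {"person_id": 1, "name": "Ann", "phones": ["555-0100", "555-0101"]},
    {"person_id": 2, "name": "Bob", "phones": []},
]


def to_first_normal_form(records):
    """Split records into a person table and a phone table with atomic values."""
    # person_id is the primary key of the person table.
    person_rows = [(r["person_id"], r["name"]) for r in records]
    # (person_id, phone) is the composite key of the phone table:
    # one row per phone number instead of a repeating group.
    phone_rows = [(r["person_id"], phone) for r in records for phone in r["phones"]]
    return person_rows, phone_rows


person_rows, phone_rows = to_first_normal_form(people)
```

A person with no phone simply has no rows in the phone table, so no empty or padded columns are needed.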
==>Second normal form
Second normal form (2NF) requires that data stored in a table with a composite primary key must not be dependent on only part of the table's primary key:
The database must meet all the requirements of the first normal form.
Any attribute that depends on only part of the composite primary key, and is therefore duplicated across multiple rows of the table, is moved out to a separate table.
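A small Python sketch of a 2NF decomposition, under assumed data: the key of the hypothetical order_items table is (order_id, product_id), but product_name depends on product_id alone, so it is moved to its own table.

```python
# Hypothetical rows: (order_id, product_id, quantity, product_name).
# The key is (order_id, product_id); product_name depends on product_id only.
order_items = [
    (1, "p1", 2, "Widget"),
    (1, "p2", 1, "Gadget"),
    (2, "p1", 5, "Widget"),
]


def to_second_normal_form(rows):
    """Move the partially dependent attribute into a separate product table."""
    items = [(order_id, product_id, qty) for order_id, product_id, qty, _ in rows]
    products = sorted({(product_id, name) for _, product_id, _, name in rows})
    return items, products


items, products = to_second_normal_form(order_items)

# Joining the two tables back on product_id recovers the original rows.
name_of = dict(products)
rejoined = [(o, p, q, name_of[p]) for o, p, q in items]
```

The name "Widget" is now stored once rather than once per order line, and nothing is lost because the join rebuilds the original rows.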
==>Third normal form
Third normal form (3NF) requires that data stored in a table be dependent only on the primary key, and not on any other field in the table.
The database must meet all the requirements of the second normal form.
Any field which is dependent not only on the primary key but also on another field is moved out to a separate table.
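The 3NF rule can be illustrated with a short Python sketch. The employee data and the helper fd_holds are assumptions made for the example. Note that a dependency holding in sample rows only suggests a real functional dependency; the actual dependencies come from the problem domain.

```python
# Hypothetical rows: emp_id is the key, but dept_name depends on dept_id.
employees = [
    {"emp_id": 1, "dept_id": "D1", "dept_name": "Sales"},
    {"emp_id": 2, "dept_id": "D1", "dept_name": "Sales"},
    {"emp_id": 3, "dept_id": "D2", "dept_name": "Support"},
]


def fd_holds(rows, lhs, rhs):
    """True if equal values of lhs always go with equal values of rhs in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


# dept_name depends on a non-key field, so it is moved to its own table.
transitive = fd_holds(employees, ["dept_id"], ["dept_name"])
employee_table = [(r["emp_id"], r["dept_id"]) for r in employees]
department_table = sorted({(r["dept_id"], r["dept_name"]) for r in employees})
```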
==>Boyce-Codd normal form
Boyce-Codd normal form (or BCNF) requires that there be no non-trivial functional dependencies of attributes on anything other than a superkey (a superset of a candidate key).
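One way to test the BCNF condition is to compute attribute closures and flag every non-trivial functional dependency whose left side is not a superkey. The sketch below does that in Python. The student/course/teacher relation and its dependencies are an assumed textbook-style example, and the function names are made up.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the non-trivial FDs whose left side is not a superkey of relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not set(relation) <= closure(lhs, fds)
    ]


# Assumed dependencies: (student, course) -> teacher, and teacher -> course.
relation = {"student", "course", "teacher"}
fds = [
    (frozenset({"student", "course"}), frozenset({"teacher"})),
    (frozenset({"teacher"}), frozenset({"course"})),
]
violations = bcnf_violations(relation, fds)
```

Here teacher -> course is reported because teacher alone does not determine student, so teacher is not a superkey.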
==>Fourth normal form
Fourth normal form (or 4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superkey (a superset of a candidate key). A table is said to be in 4NF if and only if it is in BCNF and all of its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
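A minimal Python sketch of a 4NF decomposition, using invented data: an employee's skills and languages are independent of each other, so the single table has to store every combination. Splitting it into two projections stores each fact once, and joining them gives the original rows back.

```python
# Hypothetical rows: (employee, skill, language).
# Skills and languages are independent facts about an employee.
rows = {
    ("ann", "sql", "english"),
    ("ann", "sql", "french"),
    ("ann", "python", "english"),
    ("ann", "python", "french"),
    ("bob", "sql", "german"),
}

# Decompose into two projections, one per multivalued dependency.
skills = {(e, s) for e, s, _ in rows}
languages = {(e, l) for e, _, l in rows}

# The natural join on employee reproduces the original table.
rejoined = {(e, s, l) for e, s in skills for e2, l in languages if e == e2}
```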
==>Fifth normal form
Fifth normal form (5NF, also called PJ/NF) requires that there are no non-trivial join dependencies that do not follow from the key constraints. A table is said to be in 5NF if and only if it is in 4NF and every join dependency in it is implied by the candidate keys.
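A join dependency can be checked on sample data by projecting a table onto its parts and joining the projections back together. The Python sketch below does this for an assumed supplier/part/project table; the data and the function name join_of_projections are made up for the example.

```python
def join_of_projections(rows):
    """Join the three binary projections of a (supplier, part, project) table."""
    sp = {(s, p) for s, p, _ in rows}
    pj = {(p, j) for _, p, j in rows}
    js = {(j, s) for s, _, j in rows}
    return {
        (s, p, j)
        for s, p in sp
        for p2, j in pj
        if p == p2 and (j, s) in js
    }


# Hypothetical data: three rows whose projections imply a fourth row.
three = {("s1", "p1", "j2"), ("s1", "p2", "j1"), ("s2", "p1", "j1")}
four = three | {("s1", "p1", "j1")}

# The join dependency holds for "four" but not for "three":
# joining the projections of "three" produces a row the table does not contain.
holds_for_four = join_of_projections(four) == four
holds_for_three = join_of_projections(three) == three
```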
==>Domain/key normal form
Domain/key normal form (or DKNF) requires that the database contains no constraints other than domain constraints and key constraints.
==>Sixth normal form
This normal form was, as of 2005, only recently proposed: the sixth normal form (6NF) was only defined when extending the relational model to take into account the temporal dimension. Unfortunately, most current SQL technologies as of 2005 do not take into account this work, and most temporal extensions to SQL are not relational. See work by Date, Darwen and Lorentzos [1] for a relational temporal extension, or see TSQL2 for a different approach.
For a useful database design, normalization is not the only consideration: how to implement tree-structured data, or a logic-based hierarchical model, on top of the relational model also needs to be considered.
2006-09-20 18:47:56
·
answer #1
·
answered by Amit G 4
·
0⤊
0⤋
Don't pay attention to all that previously Dikpedia pasted hoopla. If you have a range of data, just divide every value by the maximum value. That is normalization.
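What this answer describes is rescaling a range of numbers, which could look roughly like the Python below. The function name scale_by_max is made up, and the sketch assumes the maximum value is not zero.

```python
def scale_by_max(values):
    """Divide every value by the largest value in the list."""
    peak = max(values)
    if peak == 0:
        # Assumption: a zero maximum is treated as an error rather than guessed at.
        raise ValueError("cannot scale by a maximum of zero")
    return [v / peak for v in values]


scaled = scale_by_max([2, 5, 10])
```

With positive inputs, the largest value becomes 1.0 and everything else falls between 0 and 1.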
2006-09-21 01:52:44
·
answer #2
·
answered by x 5
·
0⤊
0⤋
http://www.datamodel.org/NormalizationRules.html
http://www.devbuilder.org/article/13
2006-09-21 01:45:05
·
answer #4
·
answered by yoursuperman000 2
·
0⤊
0⤋