220 likes | 287 Views
Normalization. 1NF, 2NF, 3NF. Introduction. In Kp.88 we have the suppliers and parts database S { S#, SNAME, STATUS, CITY} P { P#, PNAME, COLOR, WEIGHT, CITY} SP { S#, P#, QTY} What happens if the design is changed in some way like Supplier ’ s CITY is inserted in SP SCP
E N D
Normalization 1NF, 2NF, 3NF
Introduction • In Kp.88 we have the suppliers and parts database • S {S#, SNAME, STATUS, CITY} • P {P#, PNAME, COLOR, WEIGHT, CITY} • SP {S#, P#, QTY} • What happens if the design is changed in some way like • Supplier’s CITY is inserted in SP SCP • See Kp. 408 Fig.11.1
Introduction (cont.1) • In Fig. 11.1(kp.408), the sample value for relvar SCP • There are many redundant information • S1’s city London: 6 times • S4’s city London: 3 times • What’ll be remained if update on S1’s city happens incorrectly due to the redundancy? • Some S1’s city is London • Some S1’s city is Amsterdam • …… good design principle “One fact in one place” = avoiding redundancy
Introduction (cont.2) • Normalization • Concerns about how a given relation containing certain undesirable properties can be converted to a more desirable form • Normal forms • If a relation satisfies a certain specified set of constraints
Normal forms • Levels of normalization • Universe of relvars • Normalized and unnormalized • 1NF relvars • Normalized • 2NF relvars • 3NF relvars • BCNF relvars • 4NF relvars • 5NF relvars • See Kp. 411 Fig.11.2 • The normalization procedure is reversible • Ex) 2NF 3NF • No information is lost More highly normalized
Nonloss Decomposition • Nonloss decomposition satisfies the following two properties • Breaking down a relvar does not lose information • Reversibility • The original relvar is equal to the join of decomposed relvars (join of its projections) • Correct further normalization has to satisfy this property!
Example of nonloss decomposition • S{S#, STATUS, CITY} is decomposed into two ways • SST{S#, STATUS}, SC{S#, CITY} nonloss • SST{S#, STATUS}, STC{STATUS. CITY} lossy
FD diagrams • Pictorial representation of FDs • Ex) FD diagrams for relvars S, SP, and P
First Normal Form • A relvar is in 1NF • Iff every tuple contains exactly one value for each attribute in every legal value of the relvar • Ex) also see fig.11.6 (kp.418) FIRST{S#, STATUS, CITY, P#, QTY} PRIMARY KEY {S#, P#}
Anomalies in 1NF • Let’s think about anomalies due to the FD S#CITY • Insert • We cannot just insert a supplier’s city unless the supplier must supply at least one part • Ex) insertion of <S5, , Athens, , > primary key (S#, p#) value becomes null : not allowed
Anomalies in 1NF (cont.1) • Delete • If we delete a sole tuple for a particular supplier, we lose • not only his shipment • but also his city. • Ex) <S3, 10, Paris, P2, 200>: 4th from the bottom in fig.11.6 (kp.418)
Anomalies in 1NF (cont.2) • Update • If we update the city value for a particular supplier, we may have to update many tuples • Ex) <S1, London> <S1, Amsterdam>: • We have to update 6 tuples • May cause inconsistency if we miss updating any tuple
Solution for the anomalies of 1NF • Decompose relvar FIRST{S#, STATUS, CITY, P#, QTY} into 2 relvars • SECOND{S#, STATUS, CITY} • SP{S#, P#, QTY} • See fig. 11.8 (kp. 420)
Solution for the anomalies of 1NF(cont.) • New relvars • SECOND{S#, STATUS, CITY} • SP{S#, P#, QTY} • See fig. 11.8 (kp. 420) and check • Can we insert <S5, , Athens>? • Can we delete <S3, p2, 200> without deleting S3’s information in SECOND? • Can we update S1’s city in one tuple only?
Second Normal Form • A relvar is in 2NF iff • 1NF and • Every nonkey attribute is irreducibly dependent on the primary key • Ex) • FIRST{S#, STATUS, CITY, P#, QTY} • Not 2NF because of FDs S#CITY, S#STATUS • S# is reduced from the primary key {S#, P#} • SECOND{S#, STATUS, CITY}, SP{S#, P#, QTY} • 2NF
Problem of 2NF • Lack of mutual independence among its nonkey attributes • Ex) in SECOND{S#, STATUS, CITY}, SP{S#, P#, QTY}, • we still have an FD CITYSTATUS • Because S#CITY, CITYSTATUS, • S#STATUS : transitive dependency
Anomalies of SECOND • SECOND{S#, STATUS, CITY}, • FD S#STATUS, S#CITY, CITYSTATUS • Insert • We cannot insert the fact only that a particular city has a particular status (CITY STATUS) • We can insert the fact only when a supplier is actually in the city • Delete • If we delete a tuple in SECOND, we may delete the STATUS information of the CITY • Ex) if we delete <S5, 30, Athens> in Fig.11.8 (kp.420), we lose STATUS information of Athens also.
Anomalies of SECOND (cont.) • SECOND{S#, STATUS, CITY}, • FD S#STATUS, S#CITY, CITYSTATUS • Update • If we update the STATUS value for a particular CITY, we may have to update many tuples • Ex) <London, 20> <London, 30>: • We have to update 2 tuples • May cause inconsistency if we miss updating any tuple
Solution for Anomalies of SECOND • Decompose • SECOND{S#, STATUS, CITY}, • FD S#STATUS, S#CITY, CITYSTATUS • Into SC{S#, CITY}, CS{CITY, STATUS} • The effect of the decomposition is to eliminate the transitive dependencies • See fig.11.10 (kp. 423) for example tables
Third Normal Form • A relvar is 3NF iff • 2NF • Every nonkey attribute is nontransitively dependent on the primary key • In other words, no mutual dependency • Ex) SC, and CS: 3NF