1 / 45

Foundations of Database Systems

Foundations of Database Systems. Normalization Instructor: Zhijun Wang. Modification Anomalies. Deletion Anomaly Insertion Anomaly Update Anomaly. Modification Anomalies. The EQUIPMENT_REPAIR table before and after an incorrect update operation on AcquisitionCost for Type = Drill Press:.

kaycee
Download Presentation

Foundations of Database Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Foundations of Database Systems Normalization Instructor: Zhijun Wang DBMS and Applications

  2. Modification Anomalies • Deletion Anomaly • Insertion Anomaly • Update Anomaly DBMS and Applications

  3. Modification Anomalies • The EQUIPMENT_REPAIR table before and after an incorrect update operation on AcquisitionCost for Type = Drill Press: DBMS and Applications

  4. Normalization • Normalization is a process of evaluating and converting a relation to reduce modification anomalies • Essentially, normalization detects and eliminates data redundancy DBMS and Applications

  5. Normal Forms • 1NF– A table that qualifies as a relation is in 1NF • 2NF– A relation is in 2NF if all of its nonkey attributes are dependent on all of the primary key • Boyce-Codd Normal Form (BCNF)– A relation is in BCNF if every determinant is a candidate key • 3NF– A relation is in 3NF if it is in 2NF and has no determinants except the primary key DBMS and Applications

  6. Job Application Example Jbno Jsname JsQual …… Pemp …… DBMS and Applications

  7. First Normal Form (1NF) • To be in First Normal Form (1NF) a relation must have only single-valued attributes -- neither repeating groups nor arrays are permitted • UNF • Job-Seeker(Jbno, Jsname, (JsQual, Subject, level, award), (Pemp, employer, addr, pay)) • 1 NF • Job-Seeker(Jbno, Jsname) • (JsQual, Subject, level, award) • (Pemp, employer, addr, pay) DBMS and Applications

  8. Second Normal Form (2NF) • To be in Second Normal Form (2NF) the relation must be • in 1NF and • each non-key attribute must be dependent on the whole key (not a subset of the key) • 2NF • (Pemp, employer, addr, pay) => • (Pemp, pay), (employer, addr) DBMS and Applications

  9. Boyce-Codd Normal Form (BCNF) • Let R be a relation schema • X be a subset of the attributes of R • A be an attribute of R • R is in BCNF if • for every FD X  A, one of the following is true • A  X (it is a trivial FD), or X is a super-key. • Theoretically we must check each FD • X  A in F+ • It is actually sufficient to check • whether the LHS of each FD in F is a super-key. DBMS and Applications

  10. Testing for BCNF • E.g. R (A, B, C, D), with F = { A B, B C} • Decompose R into R1(A,B) and R2(A,C,D) • What is the key for R2 ? • Neither of the dependencies in F contain only attributes from (A,C,D) • so we might be mislead into thinking R2 satisfies BCNF. • In fact, dependency AC in F + shows R2 is not in BCNF. DBMS and Applications

  11. Testing for BCNF • R2(A,C,D) with functional dependence {AC} • Not trivial. • The key is (A, D) • A is not a superkey. • R2(A,C,D) is not in BCNF DBMS and Applications

  12. Lossless-join decomposition into BCNF • Given schema R • W  Schema R • A is a single attribute • W  A is a FD that violates BCNF • Decompose R into • R  A, and • WA • If either is not in BCNF, decompose recursively • the decomposition is lossless-join DBMS and Applications

  13. Example of BCNF Decomposition • R = (branch-name, branch-city, assets, customer-name, loan-number, amount)F = {branch-name  assets branch-city loan-number  amount branch-name}Key = {loan-number, customer-name} • Decomposition • R (F1) • R1 = (branch-name, branch-city, assets) • R2 = (branch-name, customer-name, loan-number, amount) • R2 (F2) • R3 = (branch-name, loan-number, amount) • R4 = (customer-name, loan-number) • Final decomposition R1, R3, R4 DBMS and Applications

  14. Example of BCNF Decomposition R = (branch-name, branch-city, assets,customer-name, loan-number, amount) branch-name  assets branch-city WA R-A R1 = (branch-name, branch-city, assets) R2 = (branch-name,customer-name, loan-number, amount) DBMS and Applications

  15. Example of BCNF Decomposition R2 = (branch-name,customer-name, loan-number, amount) loan-number  amount branch-name WA R-A R3 = (branch-name, loan-number, amount) R4 = (customer-name, loan-number) DBMS and Applications

  16. Multiple Ways to Decompose • If there are multiple dependencies (in a relation) that violate BCNF • There may be different ways to decompose, resulting in different collections of BCNF relations • A designer has to choose one based on the semantics of the applications. DBMS and Applications

  17. Steps to BCNF DBMS and Applications

  18. Putting a Relation into BCNF:EQUIPMENT_REPAIR DBMS and Applications

  19. Putting a Relation into BCNF:EQUIPMENT_REPAIR EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount) ItemNumber  (Type, AcquisitionCost) RepairNumber  (ItemNumber, Type, AcquisitionCost, RepairDate, RepairAmount) ITEM (ItemNumber, Type, AcquisitionCost) REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount) Where REPAIR.ItemNumber must exist in ITEM.ItemNumber DBMS and Applications

  20. Putting a Relation into BCNF: New Relations DBMS and Applications

  21. Putting a Relation into BCNF: SKU_DATA DBMS and Applications

  22. Putting a Relation into BCNF: SKU_DATA SKU_DATA (SKU, SKU_Description, Department, Buyer) SKU  (SKU_Description, Department, Buyer) SKU_Description  (SKU, Department, Buyer) Buyer  Department SKU_DATA (SKU, SKU_Description, Buyer) BUYER (Buyer, Department) Where BUYER.Buyer must exist in SKU_DATA.Buyer DBMS and Applications

  23. Putting a Relation into BCNF: New Relations DBMS and Applications

  24. Third Normal Form (3NF) • To be in Third Normal Form (3NF) • the relation must be in 2NF • no transitive dependencies may exist within the relation. • A transitive dependency is when • an attribute is indirectly functionally dependent on the key (that is, the dependency is through another non-key attribute) DBMS and Applications

  25. Third Normal Form • Let • R be a relation schema • X be a subset of the attributes of R • A be an attribute of R • R is in 3NF if • for every FD X  A, one of the following is true • A  X (it is a trivial FD), or • X is a super-key, or • A is part of some key for R. {not for BCNF} • BCNF is stronger than 3NF • every BCNF relation is also in 3NF DBMS and Applications

  26. Violation of 3NF Transitivity problem Duplication of same data, e.g. (Randolph,fee) DBMS and Applications

  27. FD/Normalization Example • Consider a relations R with 5 attributes ABCDE. You are given the following dependencies: A->B, BC->E, ED->A • List all keys for R • Is R in 3NF? • Is R in BCNF? DBMS and Applications

  28. FD/Normalization Example • Take each FD in turn and find out the keys for R • ED->A • ED->A & A->B => ED->B • => ED->AB => EDC->ABC => EDC->ABCDE • Hence EDC is a key • BC -> E • BC->E => BCD->ED => BCD->A => BCD->ABCDE • Hence BCD is also a key • A->B • A->B => AC->BC • AC->BC & BC->E => AC->E => AC->BCE => ACD->BCDE =>ACD->ABCDE • Hence ACD is also a key DBMS and Applications

  29. FD/Normalization Example • Test the conditions for each FD to see if R is in 3NF • A->B • Not trivial. • A is not a superkey. • B is part of the key BCD. • BC-> E • Not trivial. • BC is not a superkey. • E is part of the key EDC. • ED->A • Not trivial. • ED is not a superkey. • A is part of the key ACD. R is in 3NF DBMS and Applications

  30. FD/Normalization Example • Test the conditions for each FD to see if R is in BCNF • A->B • Not trivial. • A is not a superkey. • BC-> E • Not trivial. • BC is not a superkey. • ED->A • Not trivial. • ED is not a superkey. R is not in BCNF DBMS and Applications

  31. 3NF vs BCNF Example Reserves ( sailorID, boatID, day, creditCardID ) • Consider FDsailorIDcreditCardID • i.e., a sailor uses a unique credit card to pay for reservations • sailorIDis not a superkey • creditCardID is not part of a key • this relation is not in 3NF DBMS and Applications

  32. 3NF vs BCNF Example • But if we also know FDcreditCardID sailorID • i.e., credit cards uniquely identify the owner • <creditCardID, boatID, day> is also a key • FD1SailorID  creditCardID does not violate 3NF • Because creditCardID is part of a key • FD2 creditCardID  SailorID (previous slide) does not violate 3NF • Reservesis in 3NF • but in all tuples with same S value, the same <S, C> pair is redundantly stored • Reserves is not in BCNF DBMS and Applications

  33. Bank Example Banker-info-schema = (branch-name, customer-name, banker-name, office-number) • The functional dependencies for this relation schema are:banker-name branch-name office-number customer-name branch-name banker-name • The key is: {customer-name, branch-name} DBMS and Applications

  34. Bank Example (branch-name, customer-name, banker-name, office-number) • Is it in 3NF? • banker-name branch-name office-number • Not trivial • Not superkey • NOT part of a candidate key • customer-name branch-name banker-name • Not trivial • Superkey • NOT Part of a candidate key It is not in 3NF DBMS and Applications

  35. Bank Example (branch-name, customer-name, banker-name, office-number) banker-namebranch-name office-number WA R-A (banker-name, branch-name, office-number) (customer-name, branch-name, banker-name) (customer-name, branch-name, banker-name) DBMS and Applications

  36. Applying 3NF to Banker-info-schema Banker-office-schema = (banker-name, branch-name, office-number) Banker-schema = (customer-name, branch-name, banker-name) DBMS and Applications

  37. Hourly_Emps Example • 2 FDs • I  (I,N,L,R,W,H) • R  W • Since • R is not a superkey and • W is not part of any key, • it is a violation of 3NF DBMS and Applications

  38. Decompositions • Decompose INLRWH into INLRH & RW • INLRH has {hkid, name, lot, rating, hours_worked} • FD I  INLRH • I is a key for INLRH • RW has {rating, hourly_wage} • FD R  W • R is a key for RW • Both schemas are in BCNF • This decomposition was guided by the FD (R  W) that violated 3NF • Method: remove W from the main schema DBMS and Applications

  39. Decomposition R  W (I,N,L,R,W,H) (R,W) (I,N,L,R,H) I  INLRH R  W BCNF!! BCNF!! DBMS and Applications

  40. Normalization: Advantages and Disadvantages DBMS and Applications

  41. Summary of Normal Forms DBMS and Applications

  42. Denormalization for Performance • De-Normalization (Controlled Redundancy) • May want to use non-normalized schema for performance • E.g. Two tables • Branch= (branch-name, branch-city,assets) • Loan-info = (customer-name, loan-number, branch-name, amount) • displaying branch-city along with loan-number and amount requires join of Branch with Loan-info • Solution: Use denormalized relation containing attributes of Branch as well as Loan-info with all above attributes • faster lookup • Extra space and extra execution time for updates • Extra coding work for programmer and possibility of error in extra code DBMS and Applications

  43. The Referential Integrity Constraint • A referential integrity constraint is a statement that limits the values of the foreign key to those already existing as primary key values in the corresponding relation DBMS and Applications

  44. Foreign Key with a Referential Integrity Constraint NOTE: The primary key of the relation is underlined and any foreign keys are in italics in the relations below: SKU_DATA (SKU, SKU_Description, Department, Buyer) ORDER_ITEM (OrderNumber, SKU, Quantity, Price, ExtendedPrice) Where ORDER_ITEM.SKU must exist in SKU_DATA.SKU DBMS and Applications

  45. Questions? DBMS and Applications

More Related