1 / 45

Decomposition and Functional Dependency

I am a pig. Then, you must have 4 legs!. Decomposition and Functional Dependency. Outline. Redundancy Decomposition at first glance Functional dependency Dependency properties. Why?. We have learnt that, an ER diagram can be directly converted to relational tables.

xuxa
Download Presentation

Decomposition and Functional Dependency

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. I am a pig. Then, you must have 4 legs! Decomposition andFunctional Dependency

  2. Outline • Redundancy • Decomposition at first glance • Functional dependency • Dependency properties

  3. Why? • We have learnt that, an ER diagram can be directly converted to relational tables. • However, these tables may contain redundancy, namely, repetition of the same information. • To eliminate redundancy (as much as possible), we need to refine the relational tables (known as normalization in db terminology).

  4. An example of redundancy

  5. An example of redundancy • Negative impacts of redundancy • Higher space consumption. • Higher update overhead • Imagine the operations we need to do, if we raise the hourly-wage of B1 to 120. • Insertion/update anomaly • The DBA must prevent insertion of tuples resulting in inconsistent hourly-wages, e.g., (7, B1, 200). • Solution?

  6. An example of redundancy (cont.)

  7. An example of redundancy (cont.) rating id hourly-wages • In fact, the two tables could have been obtained directly, if we had designed a “perfect ER diagram”. • It is not realistic to assume that we can always discover the perfect ER diagram. • We need a tool to refine our design, even if we started from an imperfect diagram. have salary employee

  8. Outline • Redundancy • Decomposition at first glance • Functional dependency • Dependency properties

  9. Basic questions to ask • Do we need to decompose a relation? • What problems (if any) does a given decomposition cause? • Lossless-join • Dependency-preservation

  10. Decomposition • Just now, we decomposed EMPLOYEE into two tables, to avoid redundancy. • Note that, the new tables can reproduce the original EMPLOYEE. • This is a rule we must obey:the decomposed tables must be able to reproduce the original table. Lossless Join!

  11. Illegal decomposition • This decomposition violates the rule mentioned earlier. • How do we judge whether a decomposition is legal?

  12. Another illegal decomposition Lossless Join? NO! • The judgment is not that easy!

  13. legal decomposition • Checking the “legitimacy” of decomposition: • The new tables must have common attribute(s). • The common attribute(s) must be the candidate key of at least one new table.

  14. Some confusing notions Better Definitions Later • Key? An abbreviation of candidate key. • Candidate key? A minimal set of attributes that uniquely identifies every tuple. • Primary key? A candidate key selected by a database designer • Superkey Any superset of a candidate key.

  15. Decomposition may not be obvious • How to decompose the above table to minimize redundancy? • Before we can answer the question, we need to gain more understanding about redundancy.

  16. Outline • Redundancy • Decomposition at first glance • Functional dependency • Dependency properties

  17. Why does redundancy exist? • Reason: • ratingdetermineshourly-wages. • Once the tuple’s rating is known, its hourly-wages is also decided. • A concise representation: ratinghourly-wages.

  18. Functional dependency • ratinghourly-wages • is called a functional dependence (FD). • Do we have “ratingid”? • If tuple’s rating is known, are we sure about its id? • No.

  19. FD (cont.) • Do we have hourly-wagesrating? • Yes, different ratings havedifferent hourly-wages. • Namely, if we know a tuple’s hourly-wages, then its rating has only one possibility. • Do we have idrating? • Yes, because each employee has only a single rating.

  20. FD (cont.) • Do we have idid? • Of course, known as a trivial FD. • Do we have id (rating, hourly-wages)? • Yes, because each employee has only a single (rating, hourly-wages)-combination. My id = 1 You are at scale B1 and earn $100 per hour.

  21. FD (cont.) • Do we have (id, rating) hourly-wages? • In English, if a tuple’s (id, rating)-combination is decided, how many possibilities for hourly-wages? • In fact, once (id, rating) is decided, we know exactly which employee is concerned. • Therefore, hourly-wages has only one possibility. • So “(id, rating) hourly-wages” is true.

  22. Functional dependency definition • Let L and R be two sets of attributes. • LR means that • if we know a tuple’s L, then there is only a singlepossibility for the tuple’s R! • I.e., if we know L, we know R. • ratinghourly-wages • hourly-wagesrating • idrating • idid • id (rating, hourly-wages) • (id, rating) hourly-wages L R!

  23. Secret of redundancy • In general, a table has redundancy, if there is a FD, whose left handside is not a candidate key. • For example, the only candidate key of EMPLOYEE is id. • EMPLOYEE has redundancy, because we have ratinghourly-wages.

  24. Where are FDs from? • Two channels. • First, common senses. • HK-idname. • countrycapital. • (father, mother) eldest-child. • … • Second, special constraints of the underlying application. • If every employee has her/his own office • emp-idoffice-number. • If every customer can have a single account • cust-idacc-id.

  25. Outline • Redundancy • Decomposition at first glance • Functional dependency • Dependency properties

  26. A candidate key determines all • For example, a candidatekey of EMPLOYEE is id. • Thus, id determines any combination of the attributes. • idid • idrating • idhourly-wages • idrating, hourly-wages • idid,rating, hourly-wages • If we know the tuple’s id, then its any attribute has only 1 possibility.

  27. Super key determines all • A candidate key is id. • Then, (id, rating) determines any combination of the attributes. • (id, rating) id • (id, rating) rating • (id, rating) hourly-wages • (id, rating) rating, hourly-wages • (id, rating) id,rating, hourly-wages • We only need id to claim that the tuple’s any attribute has only 1 possibility. • So, of course, given its (id, rating), we can make the same claim.

  28. Trivial functional dependences • “id id”is trivially true. • Put it in English, and you willfind out. • “If we know a tuple’s id = 1, then we know its id.” I have 4 legs. Guess how many legs I have. Don’t waste my time.

  29. Trivial functional dependences (cont.) • L R is trivial, if L contains R. • Examples: • (id, rating)id • (id, rating)rating • (id, rating)(id, rating) • (id, hourly-wages)id • (id, rating, hourly-wages)(id, rating) • …

  30. Inference rules for FDs • Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold • One example: id rating rating hourly-wages What can we derive from these two FDs? id hourly-wages • More rules to come in next page

  31. Union • Given • cust-id cust-name (1) • cust-id cust-city (2) • we can derive • cust-id(cust-name, cust-city)(3) • Reasoning: • By (1), if we know cust-id, then we know cust-name. • By (2), if we know cust-id, then we know cust-city.. • Hence, if we know cust-id, then we know the (cust-name, cust-city)-combination.

  32. Transitivity • Given • creditcard-nocust-id (1) • cust-idcust-name (2) • we can derive • creditcard-nocust-name (3) • Reasoning: • By (1), if we know creditcard-no, then we know cust-id. • By (2), if we know cust-id, then we know cust-name. • Hence, if we know creditcard-no, then we know cust-name.

  33. Augmentation • Given • creditcard-no cust-id(1) • we can derive • (creditcard-no, branch-id)  (cust-id, branch-id) (2) • Reasoning: • By (1), if we know creditcard-no40101342, we know cust-id1. • Hence, if (40101342, B1) = (creditcard-no, branch-id) of a tuple, we know that (1, B1) = (stu-id , branch-id) of the tuple

  34. FD derivation • Given • creditcard-nocust-id (1) • (cust-id, branch-id)acc-id (2) • We can derive (creditcard-no, branch-id) acc-id as follows. • From(1), we have • (creditcard-no, branch-id)  (cust-id,branch-id)(3) • Augementation • From (3) and (2), by transitivity, we have • (creditcard-no, branch-id)  acc-id

  35. Summary of Inference Rules • Let R be a relation schema, W, X, Y,Z be subsets of R. • Reflexivity • If Y ⊆ X, then X Y (trivial FD’s) • Augmentation • If X Y, then XZ YZ, for every Z • Transitivity • If X Y and Y Z, then X Z • Union (Combining) Rule • If X Y and X Z, then X YZ • Decomposition (Splitting) Rule • If X YZ, then X Y and X  Z • Pseudo-transitivity Rule • If X Y and WY Z, then XW Z

  36. Prove FDs • Consider R(A, B, C, D, E) with FDs F ={A→B, B →D, DE→C} • Prove or disprove F |= AE → C • {A→B, B →D, DE→C} • |= {A → D, DE→C} (Transitivity Rule: • If X Y and Y Z, then X Z) • |= {AE → C} (Pseudo-transitivity Rule • If X Y and WY Z, then XW Z)

  37. Disprove FDs • Consider R(A, B, C, D, E) with FDs F ={A→B, B →D, DE→C} • Prove or disprove F |= A → C Find a counter example. ABCDE --------------------------- 24313 24614

  38. Closure Test • A standard way to test if FDs hold is to compute the closure of Y, denoted Y+ • Note that Y + is a set of attributes, not FDs • Basis step:Y+ = Y. • Induction: • Look for an FD’s left side X that is a subset of the current Y+ • If the FD is X -> A, add A to Y+.

  39. Prove FDs: Arevisit • Consider R(A, B, C, D, E) with FDs F ={A→B, B →D, DE→C} • Prove or disprove F |= AE → C AE+=AEBDC SinceC  AE+, AE →C is implied by F

  40. Disprove FDs: A revisit • Consider R(A, B, C, D, E) with FDs F ={A→B, B →D, DE→C} • Prove or disprove F |= A → C A+=ABD SinceC A+,A → CisnotimpliedbyF

  41. Find candidate keys • Consider a relation with schema R(A, B, C, D) and FD = {AB → C, C → D, D → A}.What are all the candidate keys of R?What are all the superkeys of R that are not candidate keys? • Listallpossiblecombinationsofallattributesandcheck! • A+=A;B+=B;C+=CDA;D+=DA • AB+=ABCD;AC+=ACD;AD+=AD;BC+=BCDA;BD+=BDAC CD+=CDA • ABC+=ABCD;ABD+=ABCD;ACD+=ACD;BCD+=ABCD • ABCD+=ABCD

  42. Can we do better? • Tricks for finding the key: • If an attribute neverappears on the RHS of any FD, it must be part of the key • If an attribute never appears on the LHS of any FD, but appears on the RHS of any FD, it must not be part of any key

  43. Anexample Consider R = {A, B, C, D, E, F, G, H} with a set of FDs F = {CD→A, EC→H, GHB→AB, C→D, EG→A, H→B, BE→CD, EC→B} Find all the candidate keys of R

  44. Anexample(cont.) F = {CD→A, EC→H, GHB→AB, C→D, EG→A, H→B, BE→CD, EC→B} • First, we notice that: • EFGnever appear on RHS of any FD. So, EFGmust be part of ANY key of R • Anever appears on LHS of any FD, but appears on RHS of some FD. So, Ais not part of ANY key of R • We now see if EFG is itself a key… • EFG+ = EFGA ≠ R; So, EFGalone is not key

  45. Anexample(cont.) • Checking by adding single attribute with EFG(except A): • BEFG+ = ABCDEFGH = R; it’s a key [BE→CD, EG→A, EC→H] • CEFG+ = ABCDEFGH = R; it’s a key [EG→A, EC→H, H→B, BE→CD] • DEFG+ = ADEFG ≠ R; it’s not a key [EG→A] • EFGH+ = ABCDEFGH = R; it’s a key [EG→A, H→B, BE→CD] • If we add any further attribute(s), they will form the superkey. Therefore, we can stop here searching for candidate key(s). • Therefore, candidate keys are: {BEFG, CEFG, EFGH}

More Related