1 / 88

B. Information Technology (Hons.) CMPB245: Database Design

B. Information Technology (Hons.) CMPB245: Database Design. Normalization Pt. 1. Objectives. Explain the purpose of normalization and the problems associated with redundant information Identify the types of update anomalies Explain the concept of functional dependency.

khanh
Download Presentation

B. Information Technology (Hons.) CMPB245: Database Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. B. Information Technology (Hons.)CMPB245:Database Design Normalization Pt. 1

  2. Objectives • Explain the purpose of normalization and the problems associated with redundant information • Identify the types of update anomalies • Explain the concept of functional dependency

  3. Normalization:The Purpose Objective 1

  4. Relational Data Analysis • Edgar Codd of IBM, proposed the application of mathematical set theory and algebra to the organization of data • Relational Data Analysis • Application of mathematical principles to the storage and manipulation of data, thereby reducing redundancy and increasing flexibility

  5. Relational Data Analysis • Prior to relational model • Data tended to be stored in a relatively ad-hoc fashion • File structures frequently mirrored the paper documents • Data duplication hence data redundancy was rife, leading to problems with maintenance and flexibility

  6. NormalizationPurpose • To develop a logical data model, we must • create an accurate representation of the data, its relationship and constraints • This is achieved by • creating a set of relations using a technique called Normalization

  7. NormalizationDefinitions • A technique for producing a set of rela-tions with desirable properties, based on the data requirements of the enterprise • A process of producing tables which con-form to specified standards, known as Normal Forms

  8. NormalizationDefinitions • A formal method that identifies relations based on their primary key and the functional dependencies among their attributes • A process of identifying and regrouping attributes so that data duplication and anomalies are avoided

  9. NormalizationInformation Reduncancy • One of the aims of database design is to group attributes into relations to minimize information redundancy • This also reduces the storage space • What sort of redundancy encountered and how can we avoid it?

  10. Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 NormalizationInformation Reduncancy • Consider the Staff_Branch relation which stores data about staffs working at a parti-cular branch STAFF_BRANCH

  11. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 NormalizationInformation Reduncancy • The details of a branch is repeated for every member of staff located at that branch

  12. Staff_No SName Position Salary Branch_No SL21 John White Manager 30000 B5 SG37 Ann Beech Sen. Assist. 12000 B3 SG14 David Ford Deputy 18000 B3 SA9 Mary Howe Assistant 9000 B7 SG5 Susan Brand Manager 24000 B3 SL41 Julie Lee Assistant 9000 B5 Branch_No BAddress Tel_No B5 22 Deer Rd., London 0171-8861212 B3 163 Main St., Glasgow 0141-3392178 B7 16 Argyll St., Aberdeen 01224-67125 STAFF BRANCH

  13. Update Anomalies:The Types Objective 2

  14. NormalizationUpdate Anomalies • Refers to the inconsistencies of updating (adding, modifying and deleting) informa-tion • Can be classified as • Insertion anomalies • Deletion anomalies • Modification anomalies

  15. Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 Update AnomaliesInsertion Anomalies (Case 1) • To insert details of new members of staff (in Staff_Branch relation) STAFF_BRANCH

  16. Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Dear Rd., London 0171-8861222 STAFF_BRANCH Update AnomaliesInsertion Anomalies (Case 1) • Correct details of branch at which the staff are to be located, must be entered

  17. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 Update AnomaliesInsertion Anomalies (Case 2) • To insert details of a new branch that has no staff allocated

  18. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 B5 22 Deer Rd., London 0171-8861212 Update AnomaliesInsertion Anomalies (Case 2) • Nulls must be entered into the attributes for staff, e.g. Staff_No

  19. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 B5 22 Deer Rd., London 0171-8861212 Update AnomaliesInsertion Anomalies (Case 2) • Since Staff_No is the primary key, entering nulls violates entity integrity!

  20. Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Update AnomaliesDeletion Anomalies • If a staff is deleted from the Staff_Branch relation that represents the last member of staff at a branch, information about that branch is also lost! STAFF_BRANCH

  21. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Update AnomaliesModification Anomalies • To change the value of one of the attri-butes of a particular branch, e.g. Tel_ No. • The rows for all staffs located at that branch must be updated as well

  22. Functional Dependency: The Concept Objective 3

  23. Functional DependenciesDefinition • Describes the relationship between attributes in a relation • e.g. if A and B are attributes of a relation, B is functionally dependent on A, if each value of A is associated with exactly one value of B B is functionally B A dependent on A Attribute Attribute

  24. B is functionally B A dependent on A Attribute Attribute Functional DependenciesDeterminant • The determinant of a functional dependency refers to the attribute or group of attributes on the left-hand side of the arrow • A is the determinant of B

  25. Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • Consider the attributes Staff_No and Position of the Staff_ Branch relation STAFF_BRANCH

  26. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • For a specific Staff_No, e.g., SL21, we can determine the Position of that staff as Manager 23

  27. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • Thus, Position is functionally dependent on Staff_No

  28. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • The opposite is not true because Staff_No is not functionally dependent on Position

  29. STAFF_BRANCH Staff_No SName Position Salary Branch_No BAddress Tel_No SL21 John White Manager 30000 B5 22 Deer Rd., London 0171-8861212 SG37 Ann Beech Sen. Assist. 12000 B3 163 Main St., Glasgow 0141-3392178 SG14 David Ford Deputy 18000 B3 163 Main St., Glasgow 0141-3392178 SA9 Mary Howe Assistant 9000 B7 16 Argyll St., Aberdeen 01224-67125 SG5 Susan Brand Manager 24000 B3 163 Main St., Glasgow 0141-3392178 SL41 Julie Lee Assistant 9000 B5 22 Deer Rd., London 0171-8861212 Functional DependenciesExample • There is another member of staff which holds the position of a Manager

  30. Functional DependenciesExample Position is functionally Staff_No Position dependent on Staff_No Staff_No SL21 Manager Determinant of X X Staff_No is not functionally Position Staff_No dependent on Position Staff_No SL21 Manager Staff_No SG5 25

  31. Functional DependenciesIdentification • How to identify functional dependencies? • Analyze each attribute • e.g. Staff_No • Check to see if other attributes in the relation are functionally dependent on it • e.g. see pg. 194, Example 6.2 • Analyze all other attributes

  32. Functional DependenciesIdentification • Group all attributes which are functio-nally dependent on a particular attribute, e.g. Staff_No SName SAddress Position Salary Branch_No Tel_No BAddress Branch_No BAddress

  33. Functional DependenciesExercise • A manufacturing company distributes its products to its customers via distribution depot which keep stocks of its products • Preliminary studies indicate that the following are sample data of orders • Identify all functional dependencies

  34. P1235 203 XYZ Ent. 2 Jln. P/2E 8252211 D-012 Bangi T1209 G107 O876 TUBE GASKET O-RING 500 300 432 189.50 280.90 130.70 Sample Data RELATION NAME CUSTOMER ORDER Customer Order No. Customer No. Customer Name Customer Address Customer Tel. No. Depot No. ATTRIBUTES Depot Name Product No. Product Name Product Quantity Product Price

  35. Customer Order No. Customer No. Customer Name Customer Address Customer Tel. No. Depot No. Depot Name Product No. Product Name Product Quantity Customer No. Customer Name Customer Address Customer Tel. No. Depot No. Depot Name Product No. Product Name Product Price

  36. Unnormalized Form(UNF) Objective 1

  37. Unnormalized Form (UNF) • A table that contains one or more repeating groups • Transfer the information from the source into a table format with rows and columns • The table is in Unnormalized Form (UNF) • See pg. 201

  38. First Normal Form(1NF) Sub-Objective 2a

  39. First Normal Form(1NF) • A relation in which the intersection of each row and column contains one and only one value • To transform the UNF to 1NF we identify and remove repeating groups within the table

  40. First Normal Form(1NF) • Repeating group • A group of attributes that occurs with multiple values for a single occurrence of the nominated key attribute(s) for that table • The nominated key attribute(s) refers to the attribute(s) that uniquely identify each row within the unnormalized table

  41. First Normal Form(1NF) • Two ways to remove repeating group: • First approach • Enter appropriate data in the empty columns of rows containing the repeating data • The resulting table (relation) contains atomic values at the intersection of each row and column

  42. First Normal Form(1NF) • Second approach • Select an attribute or attributes as a key for the unnormalized table • Remove the repeating groups by placing the repeating data, along with a copy of the original key attribute(s) to a separate relation • Identify primary keys for the new relation

  43. Unnormalized FormUNF Example • Examine the DreamHome Customer Rental Details form on pg. 216, Fig. 6.6 • The information in the forms are transformed into a table format (Unnormalized Form) • The key attribute is identified as Customer _No.

  44. Customer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No ONameCustomer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No OName CR76 John Kay PG4 6 Lawrence Street, 1-Jul-93 31-Aug-95 350 CO40 Tina Glasgow PG16 5 Novar Drive, 1-Sep-95 1-Sep-96 450 CO93 Tony Glasgow CR56 Aline Stew PG4 6 Lawrence Street, 1-Sep-92 10-Jun-93 350 CO40 Tina Glasgow PG36 2 Manor Road, 10-Oct-93 1-Dec-94 375 CO93 Tony Glasgow PG16 5 Novar Drive, 1-Jan-95 10-Aug-95 450 CO93 Tony Unnormalized FormUNF Example • Transform the information into a table for-mat (Unnormalized Form) CUSTOMER_RENTAL TABLE

  45. CUSTOMER_RENTAL TABLE Customer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No OName CR76 John Kay PG4 6 Lawrence Street, 1-Jul-93 31-Aug-95 350 CO40 Tina Glasgow PG16 5 Novar Drive, 1-Sep-95 1-Sep-96 450 CO93 Tony Glasgow CR56 Aline Stew PG4 6 Lawrence Street, 1-Sep-92 10-Jun-93 350 CO40 Tina Glasgow PG36 2 Manor Road, 10-Oct-93 1-Dec-94 375 CO93 Tony Glasgow PG16 5 Novar Drive, 1-Jan-95 10-Aug-95 450 CO93 Tony Unnormalized FormUNF Example • Then, identify the repeating group • (Property_No, PAddress, RentStart, RentFinish, Rent, Owner_No., OName)

  46. Customer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No ONameCustomer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No OName CR76 John Kay PG4 6 Lawrence Street, 1-Jul-93 31-Aug-95 350 CO40 Tina Glasgow PG16 5 Novar Drive, 1-Sep-95 1-Sep-96 450 CO93 Tony Glasgow CR56 Aline Stew PG4 6 Lawrence Street, 1-Sep-92 10-Jun-93 350 CO40 Tina Glasgow PG36 2 Manor Road, 10-Oct-93 1-Dec-94 375 CO93 Tony Glasgow PG16 5 Novar Drive, 1-Jan-95 10-Aug-95 450 CO93 Tony Unnormalized FormUNF Example • These attributes repeat for each Customer_ No • e.g. there are two values for Property_No (PG4 and PG16) for customer John Kay

  47. First Normal Form (1NF) Example: First Approach • Remove the repeating group • By entering the appropriate customer data in each row • The primary key is then identified • The primary key is the composite key (Customer_No, Property_No) • The relation is in 1NF • There is a single value at the intersection of each row and column

  48. Customer_No Property_No CName PAddress RentStart RentFinish Rent Owner_No ONameCustomer_No Property_No CName PAddress RentStart RentFinish Rent Owner_No OName CR76 PG4 John Kay 6 Lawrence Street, 1-Jul-93 31-Aug-95 350 CO40 Tina Glasgow CR76 PG16 John Kay 5 Novar Drive, 1-Sep-95 1-Sep-96 450 CO93 Tony Glasgow CR56 PG4 Aline Stew 6 Lawrence Street, 1-Sep-92 10-Jun-93 350 CO40 Tina Glasgow CR56 PG36 Aline Stew 2 Manor Road, 10-Oct-93 1-Dec-94 375 CO93 Tony Glasgow CR56 PG16 Aline Stew 5 Novar Drive, 1-Jan-95 10-Aug-95 450 CO93 Tony First Normal Form (1NF) Example: First Approach CUSTOMER_RENTAL RELATION

  49. First Normal Form (1NF) Example: Second Approach • Remove the repeating group to a separate relation • Add a copy of the original key attribute (Customer _No) • Identify a primary key for each relation Customer (Customer_No, CName) Prop_Rental_Owner (Customer_No, Property_No, Paddress, RentStart, RentFinish,Rent, Owner_No, OName)

  50. First Normal Form (1NF) Example: Second Approach Unnormalized Form (UNF) First Normal Form (1NF) Customer_No CName Property_No PAddress RentStart RentFinish Rent Owner_No OName Which of these attributes are repeating? Remove and group repeating items in a separate relation

More Related