450 likes | 574 Views
Improving the Quality of Database Designs. (Adapted from David Kroenke, Dabase Processing ). Improving the Quality of Database Designs. Minimizing Redundancy in Database Avoiding Anomalies Function Dependency Normal Forms First Normal Form Second Normal Form Third Normal Form
E N D
Improving the Quality of Database Designs (Adapted from David Kroenke, Dabase Processing)
Improving the Quality of Database Designs • Minimizing Redundancy in Database • Avoiding Anomalies • Function Dependency • Normal Forms • First Normal Form • Second Normal Form • Third Normal Form • Exercise Problems
Minimizing Redundancy in DB • Redundancy • Wastes space • Wastes time • Causes Anomalies (incorrect data)
Avoiding Anomalies • Causes • Update Anomaly • Insertion Anomaly • Deletion Anomaly
Back to UA Back to DA DVD Table Back to IA
Update Anomaly • Situation in whichUpdate in one record requires update in another record. • E.g.Suppose for dvdID #150 (Elizabeth),length is changed to 100.If length values in devID #157 and #125 are not changed also, we have anomalies.To DVD
Insertion Anomaly • Situation in whichAdding a record results in an inconsistency • Suppose another copy of The 39 Steps is added to the table. If its values of genre, length, and rating are not the same as those dvdID #120, we have an anomaly.To DVD
Deletion Anomaly • Situation in whichDeleting one record results in unintended loss of data • Suppose dvdID #172 is removed. Then all data items regarding studio DEF and its country (Poland) —will be lost.To DVD
Functional Dependence • Definition • Given: A and B are attributes of relation (table) R Then B is functionally dependent on A if and only if each value in A has associated with it exactly one value of B in R. • A B ( A determines B) • I.e., any 2 rows with same value for A will have the same value for B
Functional Dependence (1) • DVD (title, publisher, length, director, pubAddress) • publisher pubAddress (yes) • title length (no) • title, publisher length (yes) Back to 2NF
Functional Dependence (2) • Books (bkID, ISBN, title, author, pubAddress) • ISBN title (yes) • ISBN author (yes) • bkID title (yes) • bkID author (yes) • bkID pubAddress (yes) • title, publisher length (yes) • A primary key determine each nonkey attribute
First Normal Form (1NF) • A relation (table) is in 1NF if • Each row is unique (with primary key) • All attributes are atomic
Second Normal Form (2NF) • A relation (table) is in second normal form if • All nonkey attributes are dependent on all of the key. (This means that the relation is not in 2NF if any nonkey attribute is dependent on only part of the key.) • E.g., in DVD, length is dependent only on title, but not on publisher. To FD1
Back to Problems 2NF? (No) StudentdActivities
Problems • Note • Key: stdID + activities • Attribute fee is dependent only on activities (partial key). • Problems • There are obvious redundancies. • If student 175 is removed, fee($50) for Squash is deleted. • A new activity—say Surfing—cannot be entered until a student is entered To 2NF
Solution • Remove the attribute that is dependent only on part of the key and form a new table • Create a link between the new and the original tables using a foreign key • Note: if a relation (table) is 1NF and the primary key consists of a single attribute, the relation is automatically 2NF.
Solution Activities Fees
Third Normal Form (3NF) • A relation is in 3NF if • It is in 2NF and • There are no transitive dependencies. (I.e., every nonkey attribute is dependent only on the primary key.) • Table satisfying 3NF (in common terms) • Should have a field that uniquely identifies each record • Each field in the table should describe the subject that the table represents
3NF? (No) StudentHousing Back to Problems
Transitive Dependence • stdID building (I.e., building is dependent on stdID) • building fee (I.e., fee is dependent on building) • Thus,stdID building fee
Problems • StdHousing is in 2NF, but • Redundant data will introduce modification anomaly • Removing stdID 150 deletes fee value for Ingersoll • Fee for a new building—say Barrett—cannot be recorded until a new stdID is enteredTo 3NF
Solution • Remove data that is not dependent on primary key and form new relation • Create a relationship between the new and the original tables using foreign key
Solution StudentResidence ResidenceFee
Try This (Customers Table) Back to Problem
Problem • Note that • custNum ZIP ZIP city, stateI.e., custNum ZIP city, state • Transitive dependence results in redundancy and modification, insertion, & deletion anomalies. To Customers
Summary • Examine the attributes of an entity and ask the following questions. If the answer is any “Yes,” an attribute probably belong to another entity. • Does an attribute or attributes describe an entity other than the current one? • Does an attribute of the entity depend (functionally dependent) on only part of the primary key? • Does an attribute depend on something other than the primary key?
Company Database Customers custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDate Employees empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempPositionDescripempDateHireempPayRateempDateLastRaise empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempPositionDescripempDateHireempPayRateempDateLastRaise custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDateprodIdprodDescripprodCost Products prodIdprodDescripprodCost
Company Database (2) Employees empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPager Employees empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPagerempPositionempDateHireempPayRateempDateLastRaise EmployeePaysempIdempPositionempPositionDescripempDateHireempPayRateempDateLastRaise
Company Database Customers custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFax Customers custIdcustNamecustAddresscustCitycustStatecustZipcustPhonecustFaxorderNumorderQuantityorderDate Orders custIdorderNumorderQuantityorderDate
Quiz • Normalization is the process of grouping data into logically related data into tables to reduce redundancy. (T/F) • Having no duplicate or redundant data in a database, and having everything in the database normalized, is always the best way to go. (T/F) • If data is in the third normal form, it is automatically in the first and second normal forms. (T/F) • What is the major advantage of denormalized database versus a normalized database? • What are some major disadvantages of unnormalized database?
Customers custIdcustNamecustAddresscustcitycustStatecustZipcustPhonecustFax EmployeePays empPayIdempDateHireempPayRateempDateLastRaise Employees empIdempLastNameempFirstNameempMiddleNameempAddressempCityempStateempZipempPhoneempPager Positions os_idpositionposition_descrip Orders orderNumorderQuantityorderDate Exercise:What Type of Relationships Do the Tables Have?
Exercise:Normalize the following data. Take the following data and normalize it. Keep in mind that, in a real DB, there would be many more items than what is given here. Employees: Angela Smith, secretary, RR 1 Box 73, Greensburg, IN, 47890, $9.50/hour, started Jan. 22, 1996, SSN is 323149669 Jack Lee Nelson, salesman, 3334 N. Main St., Brownsburg, IN, 45687, 317-852-9901, $35,000.00/year, data started 10/28/95, SSN is 312567342 Customers: Robert’s Games & Things, 5612 Lafayette Rd., Indianapolis, IN, 46224, 317-291-7888, customer ID is 432A Reed’s Dairy Bar, 4556 W 10th St., Indianapolis, IN, 46245, 317-271-9823, customer ID is 117A CustomerOrders:Customer ID is 117A, date of last order is 2/20/1997, product ordered was napkins, and product ID is 661
Tables Customers Employees Orders SsnlastNamefirstNamestreetcitystatezipphoneNumsalaryhourlyRatestartDateposition customerIDnamestreetcitystatezipphoneNum orderIDcustomerIDproductIDproductDescripdateOrdered
Normalization Case Study • A database named “Movie Rentals” keeps track of which customer checked which movies.
Rentals Table • 1NF • Each cell value is atomic • No repeating data fields • Is the table in 1NF?
To Satisfy 1NF Requirements (1) Lets simplify the primary key
To Satisfy 1NF Requirements (2) Make “movieTitle” and “category” atomic. (Note that fullName has also been split.)
To Satisfy 1NF Requirements (3) Eliminate “repeating” phone data. Customers Phones Rentals
To Satisfy 2NF Requirements (1) • 2NF • Is in 1NF • Each nonkey field value depends on the entire primary key
To Satisfy 2NF Requirements (2) • Note that 2NF is a concern only with composite primary key.
To Satisfy 3NF Requirements (1) • For 3NF • In 2NF • All nonkey field values depend only on primary key (i.e., no transitive dependency)
To Satisfy 3NF Requirements (2) • In Customers table • Cust → nameFirst → title. I.e., tile depends on nameFirst
To Satisfy 3NF Requirements (3) Titles Customers
Titles Customers 1 N 1 1 Rentals N Phones N