330 likes | 873 Views
Normalization. What is Normalization ? Normalization Levels First Normal Form Second Normal Form Third Normal Formal Referential Integrity. What Is Normalization?. Definition process of reducing data redundancy in a relational database Process
E N D
Normalization • What is Normalization? • Normalization Levels • First Normal Form • Second Normal Form • Third Normal Formal • Referential Integrity
What Is Normalization? • Definition • process of reducing data redundancy in a relational database • Process • by organizing data into tables of various Normal Forms • Benefits • greater organization of database • reduction of redundant data • data consistency in database • more flexible database design • better handle on database security
Normal Forms • Definition • way of measuring the extent to which database has been constrained (to reduce redundancy) • Levels of Normalization • First Normal Form (1NF) • Second Normal Form (2NF) • Third Normal Form (3NF)
First Normal Form • Informal Description • Each record is unique • All attribute values are atomic • Objective of 1NF is • to divide the database into tables • so that each row is unique (with primary key) • and each field is atomic (single item per cell) • What about this? (Employees Table) • lastName: TomfirstName: Smithaddress: 123 Apple St.city: San Francisco
Company_Database emp_idlast_namefirst_namemiddle_nameaddresscitystatezipphonepagerpositiondate_hirepay_ratedate_last_raise cust_idcust_namecust_addresscust_citycust_statecust_zipcust_phonecust_faxorder_numquantityorder_dateprod_idprod_descripcost (Examples in this section are adapted from R. Stephens & R. Plew, Teach Yourself SQL in 24 Hours, SAMS, p. 47-49)
Company_Database Customers cust_idcust_namecust_addresscust_citycust_statecust_zipcust_phonecust_faxorder_numquantityorder_date Employees emp_idlast_namefirst_namemiddle_nameaddresscitystatezipphonepagerpositionposition_descripdate_hirepay_ratedate_last_raise emp_idlast_namefirst_namemiddle_nameaddresscitystatezipphonepagerpositionposition_descripdate_hirepay_ratedate_last_raise cust_idcust_namecust_addresscust_citycust_statecust_zipcust_phonecust_faxorder_numquantityorder_dateprod_idprod_descripcost Products prod_idprod_descripcost
Second Normal Form • Informal Description • All of the strictly informational attributes are attributes of the entities in the table scheme • Objective of 2NF is • to take the data items that are only partly dependent on primary key • and forming a separate table with them • What about this? (Customers Table) • custID: 12345custLName: ChewcustFName: MaryorderItem: refrigeratororderQuant: 2
Company_Database Employees emp_idlast_namefirst_namemiddle_nameaddresscitystatezipphonepager Employees emp_idlast_namefirst_namemiddle_nameaddresscitystatezipphonepagerpositiondate_hirepay_ratedata_last_raise Employee_Paysemp_idpositionposition_descripdate_hirepay_ratedate_last_raise
Company_Database Customers cust_idcust_namecust_addresscust_citycust_statecust_zipcust_phonecust_fax Customers cust_idcust_namecust_addresscust_citycust_statecust_zipcust_phonecust_faxorder_numquantityorder_date Orders cust_idorder_numquantityorder_date
Third Normal Form • Informal Description • Strictly informational attributes depends only on a primary key • Objective of 3NF is • to take data items in a table that are not dependent on the primary key • and form a separate table with them • What about this?
Company_Database Employee_Paysemp_iddate_hirepay_ratedate_last_raise Employee_Paysemp_idpositiondate_hirepay_ratedata_last_raise Positionspos_idpositionposition_descrip
Quiz • Normalization is the process of grouping data into logically related data into tables to reduce redundancy. (T/F) • Having no duplicate or redundant data in a database, and having everything in the database normalized, is always the best way to go. (T/F) • If data is in the third normal form, it is automatically in the first and second normal forms. (T/F) • What is the major advantage of denormalized database versus a normalized database? • What are some major disadvantages of denormalization?
Exercise:What Type of Relationships Do the Tables Have? Customers cust_idcust_namecust_addresscust_citycust_statecust_zipcust_phonecust_fax Employee_Payspay_iddate_hirepay_ratedate_last_raise Employees emp_idlast_namefirst_namemiddle_nameaddresscitystatezipphonepager Positionspos_idpositionposition_descrip Orders order_numquantityorder_date
Exercise:Normalize the following data. Take the following data and normalize it. Keep in mind that, in a real DB, there would be many more items than what is given here. Employees: Angela Smith, secretary, 317-545-65879, RR 1 Box 73, Greensburg, IN, 47890, $9.50/hour, started Jan. 22, 1996, SSN is 323149669 Jack Lee Nelson, salesman, 3334 N. Main St., Brownsburg, IN, 45687, 317-852-9901, $35,000.00/year, data started 10/28/95, SSN is 312567342 Customers: Robert’s Games & Things, 5612 Lafayette Rd., Indianapolis, IN, 46224, 317-291-7888, customer ID is 432A Reed’s Dairy Bar, 4556 W 10th St., Indianapolis, IN, 46245, 317-271-9823, customer ID is 117A CustomerOrders:Customer ID is 117A, date of last order is 2/20/1997, product ordered was napkins, and product ID is 661
Solutions: Employees Customers Orders customerIDnamestreetcitystatezipphoneNum orderIDcustomerIDproductIDproductDescripdateOrdered SsnnamestreetcitystatezipphoneNumsalaryhourlyRatestartDateposition
Functional Dependency • Definition • If A and B are attributes of relation R, then B is functionally dependent on A if and only if each value in R has associated with it exactly one value of B in R. • A B ( A determines B)
stuIDS1001S1003S1006S1010S1060 stuNameSmith, TomJones, MaryLee, PamelaBurns, EdwardJones, Mary MajorHistoryMathCISArtCIS Credits9095156325 statusSenSenFreshJunFresh SSN100429500010124567088520876099320985064624738 Student(stuID, stuName, major, credit, status, SSN) stuID stuNamestuID stuName, major, credit,s status, SSNSSN stuID, stuName, major, credits, status, SSNcredits status status credits (Not true)
Classes(course#, stuID, stuName, facID, sched, room, grade) Course#ART103AART103AART103ACIS201ACIS201AHST205A stuIDS1001S1010S1006S1003S1006S1001 stuNameSmith, TomBurns, EdwardLee, PamelaJones, MaryLee, PamelaSmith, Tom facIDF101F101F101F105F105F202 schedMWF9MWF9MWF9TH10TH10MWF11 RoomH221H221H221M110M110H221 gradeABAC course#, stuID stuName, facID, sched, room, gradecourse# facID, sched, roomstuID stuName
Second Normal Form (2NF) • A relation is 2NF if & only if it is in first normal form & all non-key attributes are fully functionally dependent on the key • Note: if R is 1NF and the key consists of a single attribute, the relation is automatically 2NF.
Full Functional Dependence • In a relation R, attribute B or R is fully functionally dependent on an attribute or set of attributes A of R, if B is functionally dependent on A but not functionally dependent on any proper subset of A
Classes(course#, stuID, stuName, facID, sched, room, grade) course#, stuID stuName, facID, sched, room, gradecourse# facID, sched, roomstuID stuName classes2(course#, stuID, grade)course(course#, facID, sched, room)students(stuID, stuName)
Third Normal Form (3NF) • A relation R is 3NF if it is 2NF and no non-key attribute is transitively dependent on the key. • Transitive dependence A B C stdID credits status
student(stuID, stuName, major, credit, status) stuID credits status stuID status students2(stuID, stuName, major, credits)stats(credits, status)