1 / 42

Database Administration: The Complete Guide to Practices and Procedures

Database Administration: The Complete Guide to Practices and Procedures. Chapter 3 Data Modeling & Normalization. Agenda. Data Modeling Concepts The Components of a Data Model Discovering Entities, Attributes & Relationships Conceptual, Logical & Physical Data Models Normalization

base
Download Presentation

Database Administration: The Complete Guide to Practices and Procedures

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Administration:The Complete Guide to Practices and Procedures Chapter 3 Data Modeling & Normalization

  2. Agenda • Data Modeling Concepts • The Components of a Data Model • Discovering Entities, Attributes & Relationships • Conceptual, Logical & Physical Data Models • Normalization • Additional Data Modeling Issues • Questions

  3. Data Thinking • Don’t think “how”; think “what” • Don’t think physical; think conceptual • Don’t think process; think structure • Don’t think navigation; think relationship http://datatechnologytoday.wordpress.com/2010/12/06/data-modeling-concepts-every-dba-should-know/

  4. The Blind Men & The Elephant

  5. Data Modeling Objectives achieves Function Objective • Document and communicate the business information requirements • Enable databases with: • Minimum redundancy • Maximum integrity • Sharability • Stability • Flexibility • Consistency • Timely access • Usability • Increase value of data resource e.g., Data modeling Database has Database Characteristic

  6. The Components of a Data Model • Data models are comprised of multiple • components: • Entities • Entity Types • Entity Occurrences • Relationships • Attributes • Attribute Roles • Keys • Diagramming Techniques

  7. Entities • Something that exists and is capable of being described. • A person, place, thing, concept or event about which an organization maintains facts. Student Instructor Order Course Employee Item

  8. Entity Person Concept Place Thing Event Entity Examples • Person: Roles people play • Employee • Customer • Supplier • Agent • Thing: Physical object type • Item • Part • Product • Building • Material • Equipment • Concept: Intangible ideas • Warranty • Route • Account • Place:Areas or geographic locations • Office • Warehouse • City • Region • Event:Things that "happen" • • Sale • Order • Transfer • Shipment • Project • Agreement • Reservation • Flight

  9. Entity Occurrence Customer Entity Type • Customer: Moore City: Pittsburgh State: PA Phone: 555 Credit:: Go AMEX: VISA: Contact: Entity Occurrences Customer: Jackson City: An State: MI Phone: 5 Credit:: Ex AMEX: VISA: Contact: Customer: Mullins City: Houston State: TX Phone: 555-1234 Credit:: Fair AMEX: xxxxxxxxxxxxxxxxxxx VISA: xxxxxxxxxxxxxxxxxxxx Contact: Mike

  10. Entity Naming Guidelines • Noun, or adjective noun format • Contract • Lease Agreement • Minimize use of adjectives • Contractor vs Contract Employee • Training vs Employee Course • Do not pluralize; use singular instead: Employee vs Employees • The entity type is a model or patternrather than the set of all employees • Use business terms • Vendor or Supplier? • User or Client? • Be consistent • Remove process specific artifacts from the entity name • State not Residence State • Agent not Selling Agent

  11. Relationships • How the different entities are associated with each other Places Order Product Customer Is Placed By Customer Order Places Is Placed By

  12. Cardinality One to One Relationship Optional Mandatory One to Many Relationship 1 Many to Many Relationship Many Cardinality

  13. Entity-Relationship Diagramming MOVIE MovieNum STORE MOVIE COPY Name StoreNum Movie Copy Num Director is rented as employed by Manager rents / Description EMPLOYEE General Condition Address is in Star EmployeeNum Phone Rating Genre Name Rental Rate is rented under Address Phone completes SocialSecNum HireDate CUSTOMER Salary CustomerNum Supervisor (FK) MOVIE RENTAL RECORD Name Rental Record Date Address Phone rents under reports to Rental Date JoinDate Due Date CreditCardNum Rental Status receives CreditCardExDate Rental Rate StatusCode Overdue Charge Amt makes PAYMENT Payment Transaction Num Type Amount is made on PaymentDate Status

  14. Diagramming Conventions One-to-One One-to-Many Many-to-Many 1 2 3 4 5 CUSTOMER ACCOUNT STUDENT COURSE EMPLOYEE TEMP_EMP TEMP_EMP EMPLOYEE CUSTOMER ACCOUNT COURSE STUDENT STUDENT COURSE EMPLOYEE CUSTOMER TEMP_EMP ACCOUNT M 1 M M 1 1 EMPLOYEE CUSTOMER COURSE TEMP_EMP ACCOUNT STUDENT EMPLOYEE TEMP_EMP ACCOUNT STUDENT CUSTOMER COURSE (1) Ross, (2) Bachmann, (3) Martin, (4) Chen, (5) Rumbaugh.

  15. UML

  16. Data Model Diagram Guidelines • The relationship is read clockwise over the line Reads left-to-right Customer Is placed by Order Read bottom-to-top Read top-to-bottom Order Order Places Is placed by Customer Customer Read right-to-left Customer Order Places

  17. Types of Entities • Examining relationships can help to determine the type of entity: • Fundamental Entities - represent fundamental, or basic, business objects • Characteristic Entities - contain multiple attributes or facts describing a basic entity • Associative Entities - describe a relationship between two other entities

  18. STUDENT StudentID COURSE ENROLLMENT CourseNum StudentID LastName FirstName CourseNum CourseName MiddleInit Credits CourseCompDateGrade MajorID Creating an Associative Entity STUDENT COURSE StudentID CourseNum LastName CourseName FirstName Credits MiddleInit MajorID

  19. Attributes • An attribute is a fact about an entity. It is a data element that is an inherent property of an entity. • An attribute will fulfill one (and only one) of these objectives: • Identify • Relate • Describe

  20. Customer: Smith Customer #: 0001 City: New York State: New York Phone: 555-1938 Credit: Good Customer: Smith Customer #: 0001 Attributes Customer Customer Cust Number Cust Name Street Address City State Zip Code Phone Num Credit Limit Amt ENTITY = Nouns ATTR = Adjectives

  21. Domains • The domain defines the universe of valid values for an attribute • Each attribute is assigned a domain

  22. Attribute Naming Guidelines CLASS ABBREVIATION DESCRIPTION ADDRESS ADDR Address or location AMOUNT AMT Monetary amount CODE CODE Classifications, codes, types DATE DATE Calendar dates DESCRIPTION DESC Descriptive text IDENTIFIER ID Alphanumeric unique identifier IMAGE IMG Encoded digital image NAME NAME Alphabetic name, identification NUMBER NUM Numeric count PERCENT PCT Relationships as % QUANTITY QTY Counts and units TEXT TXT Free form document text TIME TIME Time; temporal data

  23. Attribute Values and Null • Attributes and attribute values can be thought of in a similar way as entities and entity occurrences • An attribute value is an instance of an attribute. • If an attribute “value” can be null, it means one of two things: • the attribute is not applicable for certain occurrences of the entity, or • the attribute applies to all entity occurrences, but the information may be unknown. • More on Nulls in the next lesson on physical database design

  24. Keys • A key is an attribute, or group of attributes, that are used to identify. • There are three types of keys: • Candidate Keys • Primary Keys • Foreign Keys • A key should contain no embedded meaning. • The key’s purpose is to identify, not to describe.

  25. Primary Keys • The attribute or attributes that uniquely identify an entity type • A primary key value uniquely identifies a single occurrence of an entity Order Product Customer Order Number + Line Number Product ID Customer Number Order Order #: 0001 Line #: 1 0001 / 1 ProdID: 1234 Qty: 5

  26. Foreign Keys (Relationship Attributes) • An attribute (or attributes) in one entity that relates an occurrence of that entity to an occurrence of another entity • The relating attribute(s) contains the same value(s) as the primary key of the related entity occurrence Employee Org Unit works in employs Emp Num Emp Name Org Unit Num (F) Org Unit Num Org Name manages Employee Org Unit managed by Emp Num Emp Name Org Unit Num Org Name

  27. Candidate Keys • There may be multiple possible primary keys • Must identify a specific occurrence of the entity • These are known as candidate keys • There should be one, and only one, primary key

  28. Discovering Entities, Attributes, and Relationships • Keep track of the nouns, noun phrases, verbs and adjectives used by subject matter experts • Nouns are potential entities & attributes, or perhaps entity occurrences & attribute values • Adjectives and prepositional phrases are usually potential attributes • Verbs highlight potential relationships • Of course, these are not hard & fast rules and you will need to dig for meaning & understanding to build an accurate data model

  29. Additional Guidance • Put the words in context to further define them • Solicit input from multiple sources • Do not rely on just a single subject matter expert • Keep patterns in mind

  30. Data Model of Modeling Terms Relationship Primary Key Entity Occurrence Attribute Domain Attribute Value

  31. Types of Data Models • Conceptual • High level • Logical • More detailed • No many-to-many relationships • Ready to move to physical • Physical • Additional DBMS details • Ready to be implemented as a database

  32. Normalization The process of identifying the one best place a fact belongs. • First Normal Form A row is in first normal form if and only if all underlying domains contain atomic values only. • Second Normal Form A row is in second normal form if and only if it is in first normal form and every non-key attribute is fully dependent on the key. • Third Normal Form A row is in third normal form if and only if it is in second normal form and every non-key attribute is non-transitively dependent on the primary key.

  33. Unnoramlized STUDENT Data

  34. First Normal Form STUDENT COURSE

  35. Second Normal Form STUDENT ENROLLMENT COURSE

  36. Third Normal Form MAJOR STUDENT ENROLLMENT COURSE

  37. The Data Model in Third Normal Form STUDENT StudentID MAJOR LastName MajorID FirstName Major MiddleInit MajorID ENROLLMENT COURSE StudentID CourseNum CourseNum CourseName Credits CourseCompDate

  38. Further Normal Forms • Boyce Codd normal form (BCNF) • A further refinement of 3NF. Indeed, in his later writings Codd refers to BCNF as 3NF. A row is in Boyce Codd normal form if and only if every determinant is a candidate key. Most entities in 3NF are already in BCNF. • Fourth normal form (4NF) • No entity can have more than a single one-to-many relationship if the one-to-many attributes are independent of each other. An entity is in 4NF if and only if it is in 3NF and has no multiple sets of multivalued dependencies. • Fifth normal form (5NF) • Every join dependency for the entity must be a consequence of its candidate keys.

  39. Normalization in Practice • Normalization ensures that we are creating a data blueprint for the best database design possible. • Normalization is a logical process. • Many times data modelers will try to mandate that the logical model become the physical database design with no changes. But it is not wise to force DBAs and technicians to forgo physical design. • In the best of all worlds, a one-to-one logical to physical translation would work, but for many reasons, this strict one-to-one mapping is not practical. • The only normal form required for relational databases is First Normal Form. 1NF ensures tabular data.

  40. Additional Data Modeling Issues • Subtypes and super-types • Handling time • One-to-one relationships? • Derived data (see next slide)

  41. Derived and Base Data Course Course Num Course Offering Date Course Name Course Fee Enrollment Student ID Course Num Course Offering Date Course Completion Date Course Charge Back Amt(DT) Student Student ID Student Name Student Date of Birth Student Age(D)

  42. Questions

More Related