1 / 127

Database Overview

Database Overview. File Management vs Database Management Advantages of Database systems: storage persistence, programming interface, transaction management Three level Data Model DBMS Architecture Database System Components Users classification. File Management System Problems.

Download Presentation

Database Overview

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Overview • File Management vs Database Management • Advantages of Database systems: storage persistence, programming interface, transaction management • Three level Data Model • DBMS Architecture • Database System Components • Users classification

  2. File Management System Problems • Data redundancy • Data Access: New request-new program • Data is not isolated from the access implementation • Format incompatible data • Concurrent program execution on the same file • Difficulties with security enforcement • Integrity issues

  3. Advantages of Databases • Persistent Storage – Database not only provides persistent storage but also efficient access to large amounts of data • Programming Interface – Database allows users to access and modify data using powerful query language. It provides flexibility in data management • Transaction Management – Database supports a concurrent access to the data

  4. Three Aspects to Studying DBMS's 1. Modeling and design of databases. • Allows exploration of issues before committing to an implementation. 2. Programming: queries and DB operations like update. • SQL = “intergalactic dataspeak.” 3. DBMS implementation. .

  5. Definitions • A database is a collection of stored operational data used by various applications and/or users by some particular enterprise or by a set of outside authorized applications and authorized users • A DataBase Management System (DBMS) is a software system that manages execution of users applications to access and modify database data so that the data security, data integrity, and data reliability is guaranteed for each application and each application is written with an assumption that it is the only application active in the database.

  6. Three Level Data View –Data Abstractions . . . . . View1 View k Conceptual View Of Data Phyisal Data Storage

  7. DBMS Architecture

  8. Logical and Physical Database Components Logical • Data Definition Language (DDL) • Data Manipulation Language (DML) • Host Language Interface • Data Administrator • Users • Query Processor • Compiler • Optimizer • Management • Transaction Manager • File Manager • Buffer Manager • Authorization and Integrity Manager Physical

  9. Database Languages Department Faculty Dept Chair Name Dept SELECT Chair FROM Faculty, DepartmentWHERE Faculty.name = “Ken Noname” AND Faculty.Dept = Department.Dept Data definition language (DDL) ~ like type definitions in C or C++ Data Manipulation Language (DML) Query (SELECT) UPDATE < relation name > SET <attribute> = < new-value> WHERE <condition> SQL

  10. Logical Data Models • Acollectionoftoolsfordescribing • data • datarelationships • datasemantics • dataconstraints • Valuebasedmodels: ERModel, OOModel • Record Based Models: Relational Model

  11. Entity-Relationship Model • The enterprise data can be described as a set of entities and a set of relationships between them. • Entity – a data that pertains to, or describes some component of the enterprise • Each entity is characterized by a set of attributes • Relationship – describes an interconnection between different entities • Entity Set – a set of entities that are characterized by the same entity definition • Relationship Set – a set of relationships of the same type

  12. Entity-Relationship Model Example of schema in the entity-relationship model

  13. Relational Model • An enterprise is represented as a set of relations • Domain – is a set of atomic values. Each domain has a NULL value. • Data type – Description of a form that domain values can be represented. • Relation is a subset of a cartesian product of one or more domains • The elements of relations are called tuples. Each element in the cartesian product is called attribute.

  14. Relational model is good for: Large amounts of data —> simple operations Navigate among small number of relations Difficult Applications for relational model: • VLSI Design (CAD in general) • CASE • Graphical Data ALU ADDER CPU A FA Adder ALU ADDER Bill of Materials or transitive closure

  15. Relational Model Attributes Street City gpa Name Student-id • Example of tabular data in the relational model Johnson Smith Johnson Jones Smith 192-83-7465 019-28-3746 192-83-7465 321-12-3123 019-28-3746 Alma North Alma Main North 3.6 2.7 3.2 4.0 3.45 Palo Alto Rye Palo Alto Harrison Rye

  16. Relational Algebra Lecture 2

  17. Relational Model • Basic Notions • Fundamental Relational Algebra Operations • Additional Relational Algebra Operations • Extended Relational Algebra Operations • Null Values • Modification of the Database • Views • Bags and Bag operations

  18. Basic Structure • Formally, given sets D1, D2, …. Dn a relation r is a subset of D1 x D2 x … x DnThus, a relation is a set of n-tuples (a1, a2, …, an) where each ai Di • Example: customer_name = {Jones, Smith, Curry, Lindsay}customer_street = {Main, North, Park}customer_city = {Harrison, Rye, Pittsfield}Then r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield) } is a relation over customer_name , customer_street, customer_city

  19. Relation Schema • A1, A2, …, Anare attributes • R = (A1, A2, …, An ) is a relation schema Example: Customer_schema = (customer_name, customer_street, customer_city) • r(R) is a relation on the relation schema R Example: customer (Customer_schema)

  20. Database • A database consists of multiple relations • Information about an enterprise is broken up into parts, with each relation storing one part of the information account : stores information about accountsdepositor : stores information about which customer owns which account customer : stores information about customers • Storing all information as a single relation such as bank(account_number, balance, customer_name, ..)results in repetition of information (e.g., two customers own an account) andthe need for null values (e.g., represent a customer without an account)

  21. Keys • Let K  R • K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R) • by “possible r ” we mean a relation r that could exist in the enterprise we are modeling. • Example: {customer_name, customer_street} and {customer_name} are both superkeys of Customer, if no two customers can possibly have the same name. • K is a candidate key if K is minimalExample: {customer_name} is a candidate key for. • Primary Key

  22. Select Operation – Example A B C D • Relation r         1 5 12 23 7 7 3 10 • A=B ^ D > 5(r) A B C D     1 23 7 10

  23. Project Operation – Example A B C • Relation r:     10 20 30 40 1 1 1 2 A C A C • A,C (r) That is, the projection of a relation on a set of attributes is a set of tuples     1 1 1 2    1 1 2 =

  24. Union Operation – Example A B A B    1 2 1   2 3 • Relations r, s: s r A B     1 2 1 3 r  s:

  25. Set Difference Operation – Example A B A B    1 2 1   2 3 • Relations r, s: s r A B   1 1 r – s:

  26. Cartesian-Product Operation-Example A B C D E Relations r, s:   1 2     10 10 20 10 a a b b r s r xs: A B C D E         1 1 1 1 2 2 2 2         10 10 20 10 10 10 20 10 a a b b a a b b

  27. Additional Operations We define additional operations that do not add any power to the relational algebra, but that simplify common queries. • Set intersection • Natural join • Division • Assignment

  28. Set-Intersection Operation - Example A B • Relation r, s: • r  s A B    1 2 1   2 3 r s A B  2

  29. r s Natural Join Operation – Example B D E A B C D • Relations r, s: 1 3 1 2 3 a a a b b           1 2 4 1 2      a a b a b r s A B C D E      1 1 1 1 2      a a a a b     

  30. Division Operation – Example A B Relations r, s: B            1 2 3 1 1 1 3 4 6 1 2 1 2 s r s: A r  

  31. Another Division Example Relations r, s: A B C D E D E         a a a a a a a a         a a b a b a b b 1 1 1 1 3 1 1 1 a b 1 1 s r A B C r s:   a a  

  32. Example Queries • Find the largest account balance 1. Rename account relation as d 2. The query is: balance(account) - account.balance (account.balance < d.balance(account x rd (account)))

  33. customer-name, branch-name(depositoraccount)  branch-name (branch-city = “Brooklyn” (branch)) Example Queries • Find all customers who have an account at all branches located in Brooklyn city.

  34. Extended Relational-Algebra-Operations • Generalized Projection • Outer Join • Aggregate Functions

  35. Generalized Projection • Extends the projection operation by allowing arithmetic functions to be used in the projection list.F1, F2, …, Fn(E) • E is any relational-algebra expression • Each of F1, F2, …, Fn are are arithmetic expressions involving constants and attributes in the schema of E. • Given relation credit-info(customer-name, limit, credit-balance), find how much more each person can spend: customer-name, limit – credit-balance (credit-info)

  36. Aggregate Functions and Operations • Aggregation function takes a collection of values and returns a single value as a result. avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values • Aggregate operation in relational algebra G1, G2, …, GngF1( A1), F2( A2),…, Fn( An) (E) • E is any relational-algebra expression • G1, G2 …, Gn is a list of attributes on which to group (can be empty) • Each Fiis an aggregate function • Each Aiis an attribute name

  37. Aggregate Operation – Example A B C • Relation r:         7 7 3 10 sum-C gsum(c)(r) 27

  38. Aggregate Operation – Example • Relation account grouped by branch-name: branch-name account-number balance Perryridge Perryridge Brighton Brighton Redwood A-102 A-201 A-217 A-215 A-222 400 900 750 750 700 branch-nameg sum(balance) (account) branch-name balance Perryridge Brighton Redwood 1300 1500 700

  39. Aggregate Functions • Result of aggregation does not have a name • Can use rename operation to give it a name • For convenience, we permit renaming as part of aggregate operation branch-nameg sum(balance) as sum-balance (account)

  40. branch-name loan-number amount Downtown Redwood Perryridge L-170 L-230 L-260 3000 4000 1700 customer-name loan-number Jones Smith Hayes L-170 L-230 L-155 Outer Join – Example • Relation loan • Employee(ename,str,city) • Works(ename,cname,sal) • Company(cname,city) • Manages(ename,mname) • Relation borrower

  41. SQLLecture 3

  42. SQL • Data Definition • Basic Query Structure • Set Operations • Aggregate Functions • Null Values • Nested Subqueries • Complex Queries • Views

  43. Data Definition Language Allows the specification of not only a set of relations but also information about each relation, including: • The schema for each relation. • The domain of values associated with each attribute. • Integrity constraints • The set of indices to be maintained for each relations. • Security and authorization information for each relation. • The physical storage structure of each relation on disk.

  44. Basic Query Structure • A typical SQL query has the form:select A1, A2, ..., Anfromr1, r2, ..., rmwhere P order by group by having Q • Ais represent attributes • ris represent relations • P is a predicate. • This query is equivalent to the relational algebra expression. A1, A2, ..., An(P (r1 x r2 x ... x rm)) • The result of an SQL query is a relation.

  45. Set Operations (selectcustomer-name from depositor)union (selectcustomer-name from borrower) • Find all customers who have a loan, an account, or both: • Find all customers who have both a loan and an account. (selectcustomer-name from depositor)intersect (selectcustomer-name from borrower) • Find all customers who have an account but no loan. (selectcustomer-name from depositor)minus (selectcustomer-name from borrower)

  46. Aggregate Functions • These functions operate on the multiset of values of a column of a relation, and return a value avg: average valuemin: minimum valuemax: maximum valuesum: sum of valuescount: number of values

  47. Null Values and Aggregates • Total all loan amounts select sum (amount)from loan • Above statement ignores null amounts • result is null if there is no non-null amount, that is the • All aggregate operations except count(*) ignore tuples with null values on the aggregated attributes.

  48. Nested Subqueries • SQL provides a mechanism for the nesting of subqueries. • A subquery is a select-from-where expression that is nested within another query. • A common use of subqueries is to perform tests for set membership, set comparisons, and set cardinality.

  49. Set Comparison select distinct T.branch-namefrom branch as T, branch as Swhere T.assets > S.assets andS.branch-city = ‘Brooklyn’ • Find all branches that have greater assets than some branch located in Brooklyn. • Same query using > some clause select branch-namefrom branchwhere assets > some (select assetsfrom branchwhere branch-city = ‘Brooklyn’)

  50. Example Query • Find the names of all branches that have greater assets than all branches located in Brooklyn. select branch-namefrom branchwhere assets > all (select assetsfrom branchwhere branch-city = ‘Brooklyn’)

More Related