Session 8-9 Data Resource Management

  1. Session 8-9 Data Resource Management
     PART Ⅱ Introduction to the Foundation of Information Technologies
     HUANG Lihua, Fudan University

  2. Content
     • Database Concepts & Technology
     • Experiment: ACCESS
     • Database Trends
     • Data Trends of Application
     • Data Warehouse
     • OLAP
     • Data Mining
     • Creating Database Environment

  3. DATA
     • Streams of raw facts representing events, such as business transactions, or simple observations of the state of the world.

  4. FILE ORGANIZATION
     • A computer system organizes data in a hierarchy that begins with bits and proceeds to bytes, fields, records, files, and databases.
     [Figure: hierarchy diagram showing a file made up of records, each record made up of fields, each field of bytes, and each byte of bits]

  5. FILE ORGANIZATION
     • BIT: Binary digit (0, 1; Y, N; On, Off)
     • BYTE: Combination of BITS that represents a CHARACTER
     • FIELD: A logical grouping of characters into a word, a group of words, or a complete number
     • RECORD: Collection of FIELDS that reflects a TRANSACTION
     • FILE: A collection of similar RECORDS
     • DATABASE: An organization's electronic library of FILES
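
The hierarchy above can be sketched in a few lines of Python. This is only an illustration: the `StudentRecord` type, its field names, and the sample values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """RECORD: a collection of related fields (hypothetical layout)."""
    student_id: int   # FIELD
    name: str         # FIELD
    course: str       # FIELD


# FILE: a collection of similar records.
student_file = [
    StudentRecord(1, "Ann", "IS101"),
    StudentRecord(2, "Bob", "IS101"),
]

# DATABASE: a library of files, here keyed by file name.
database = {"students": student_file}

# BYTE and BIT: each character of a field is stored as a byte of 8 bits.
name_bytes = student_file[0].name.encode("ascii")   # 3 bytes for "Ann"
first_byte_bits = format(name_bytes[0], "08b")      # bits of the byte for "A"
```

Reading the sketch from the bottom up follows the slide's order: bits make a byte, bytes make a field value, fields make a record, records make a file, and files make a database.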

  6. FILE ORGANIZATION
     For example:
     • Field: Student's name
     • Record: A record
     • File:

  7. FILE ORGANIZATION
     Another way of thinking about database components:
     • ENTITY: Person, place, thing, or event about which data must be kept (a record describes an entity)
     • ATTRIBUTE: Description of a particular ENTITY (corresponds to a field)
     • KEY FIELD: Field used to retrieve, update, or sort a RECORD

  8. FILE ORGANIZATION
     [Figure: a sample file with its records, attributes, and key field labeled]

  9. KEY FIELD
     • Field in each record
     • Uniquely identifies THIS record
     • Used for RETRIEVAL, UPDATING, and SORTING

  10. Accessing Records from Computer Files: Sequential vs. Direct (Random) File Organization
     • SEQUENTIAL: Data records must be retrieved in the same physical sequence in which they are stored (e.g., magnetic tape).
     • DIRECT: Data can be accessed without regard to physical sequence (e.g., disk).
     [Figure: sequential file organization compared with direct file organization]
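
The difference between the two access methods can be sketched as follows. This is a simplified in-memory analogy, not a model of tape or disk hardware; the record layout, the `index` dictionary, and both function names are assumptions made for illustration.

```python
records = [
    {"id": 101, "name": "Ann"},
    {"id": 205, "name": "Bob"},
    {"id": 309, "name": "Cho"},
]


def sequential_lookup(records, key):
    """Read records in stored order until the key is found.

    Returns the record (or None) and the number of records read.
    """
    reads = 0
    for record in records:
        reads += 1
        if record["id"] == key:
            return record, reads
    return None, reads


# Direct access: an index maps each key field value to a storage position,
# so a record can be fetched without reading the ones stored before it.
index = {record["id"]: position for position, record in enumerate(records)}


def direct_lookup(records, index, key):
    """Jump straight to the record's position using the index."""
    position = index.get(key)
    return None if position is None else records[position]


found, reads_needed = sequential_lookup(records, 309)
direct = direct_lookup(records, index, 309)
```

With sequential access the last record costs as many reads as there are records; with direct access the cost does not depend on where the record is stored.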

  11. Traditional File Processing & File Organization
     [Figure: three separate applications, each with its own programs and its own copy of the class file: class programs (registration), accounts programs (accounting), sports programs (athletics)]

  12. Traditional File Processing & File Organization

  13. Problems Arising from the (Flat) File Organization
     • Data Redundancy: The same piece of information can be duplicated in several files.
     • Data Inconsistency: Duplicated copies of the same data can disagree when one copy is updated and the others are not.
     • Data Isolation: Data files are likely to be organized differently, stored in different formats, and often physically inaccessible to other applications.
     • Data Integrity Problem: It is difficult to enforce data integrity constraints across multiple data files.
     • Lack of Application and Data Independence: In the file environment, the applications and their associated data files depend on each other.
     • Poor Security: Security is difficult to enforce in the file environment.
     • Lack of data sharing & availability
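
A minimal sketch of how redundancy leads to inconsistency, assuming two departments each keep their own copy of a student's address. The file names, field names, and the `find_inconsistencies` helper are hypothetical.

```python
# Two departments store the same fact in separate files (redundancy).
registration_file = [{"student_id": 1, "name": "Ann", "address": "12 Oak St"}]
accounting_file = [{"student_id": 1, "name": "Ann", "address": "12 Oak St"}]

# Registration updates its own copy; accounting's copy is never told.
registration_file[0]["address"] = "99 Elm Ave"


def find_inconsistencies(file_a, file_b, key, field):
    """Return the keys whose `field` value differs between the two files."""
    b_by_key = {record[key]: record for record in file_b}
    return [
        record[key]
        for record in file_a
        if record[key] in b_by_key and record[field] != b_by_key[record[key]][field]
    ]


conflicts = find_inconsistencies(
    registration_file, accounting_file, "student_id", "address"
)
```

Storing the address once, in a shared database, removes the second copy and with it the possibility of this conflict.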

  14. Problems Arising from the File Organization
     • Data redundancy
     • Data inconsistency
     • Data isolation and data integrity problems
     • Lack of application and data independence
     • Security and data sharing problems
     These problems led to the development of the DATABASE.

  15. DATABASE
     • A database is an organized logical grouping of related files.
     • In a database, data are stored and managed in a convenient form, and are integrated and related so that one set of software programs provides access to all the data.

  16. DATABASE
     • Collection of centralized data
     • Controls redundant data
     • Data stored so as to appear to users to be in one location
     • Services multiple applications

  17. DATABASE MANAGEMENT SYSTEM (DBMS)
     • Software to create and maintain data; it enables business applications to extract data independently of specific computer programs.

  18. [Figure, file environment: the registrar, accounting, and athletics each run their own programs against their own files. Caption: Computer-based files of this type cause problems such as redundancy, inconsistency, and data isolation.]
     [Figure, database environment: the class, accounts, and sports programs all reach one database (class file, accounts file, sports file) through the DBMS. Caption: The DBMS provides access to all data in the database.]

  19. Database Environment

  20. COMPONENTS OF DBMS
     • DATA DEFINITION LANGUAGE:
       • Defines data elements in the database
     • DATA MANIPULATION LANGUAGE:
       • Manipulates data for applications
       • For example, extracting data from the database with SQL
     • DATA DICTIONARY:
       • Formal definitions of all variables in the database; controls variety of database contents
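
The three components can be seen side by side in a small sketch using Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python; the `student` table and its columns are hypothetical, and SQLite's `PRAGMA table_info` catalog query is a modest stand-in for a full data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language (DDL): define the data elements.
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " course TEXT)"
)

# Data manipulation language (DML): add, change, and extract data.
conn.execute(
    "INSERT INTO student (student_id, name, course) VALUES (?, ?, ?)",
    (1, "Ann", "IS101"),
)
conn.execute("UPDATE student SET course = ? WHERE student_id = ?", ("IS102", 1))
rows = conn.execute("SELECT name, course FROM student").fetchall()

# Data dictionary: ask the DBMS for its stored definitions of the columns.
# Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")
]
```

The application never touches the storage format directly: it states definitions and requests, and the DBMS carries them out.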

  21. Sample data dictionary report

  22. Fundamental Database Structures

  23. HIERARCHICAL DATABASE
     [Figure: tree diagram. ROOT: Employer. FIRST CHILD: Compensation, Job Assignments, Benefits. 2nd CHILD: Ratings, Salary, Insurance, Health, Pension]

  24. Types of RELATIONS
     • ONE-TO-ONE: STUDENT and ID
     • ONE-TO-MANY: CLASS and STUDENT A, STUDENT B, STUDENT C
     • MANY-TO-MANY: STUDENT A, STUDENT B, STUDENT C and COURSE 1, COURSE 2

  25. NETWORK DATA MODEL
     • Variation of the hierarchical model
     • Useful for many-to-many relationships
     [Figure: COURSE 1 and COURSE 2 each linked to STUDENT A, STUDENT B, and STUDENT C]

  26. Disadvantages of Hierarchical and Network DBMS
     • Outdated
     • Less flexible compared to RDBMS
     • Lack support for ad hoc and English-like queries

  27. RELATIONAL DATA MODEL
     • Data in table format
     • RELATION: table
     • TUPLE: row (record) in a table
     • FIELD: column (attribute) in a table

  28. Example DB: Fortune 500 Companies
     • company
         name    sales    assets    netincome  empls   indcode  yr
         allied  9115000  13271000  -279000    143800  37       85
         boeing  9035000  7593000   292000     95700   37       82
         ...
     • industry codes
         indcode  indname
         42       pharmaceuticals
         44       computers
         ...

  29. The Relational Data Model

  30. Current DBMS: Relational Database
     • DBMS vendors and products
       • Microsoft: Access, SQL Server
       • Oracle
       • Sybase
       • DB2
       • Informix
       • MySQL

  31. The Relational Database Model
     • The relational model is based on the simple concept of tables, using rows and columns of data in a way that matches real-world business situations.
     • One of the greatest advantages of the relational model is its conceptual simplicity and the ability to link records in ways that are not predefined.

  32. The Relational Abstraction
     • Information is in tables
       • Also called (base) relations
     • Columns define attributes
       • Also called fields, data items, or domains
     • Rows define records
       • Also called tuples
     • Cells contain values
       • All cells in a column hold information of the same type
       • e.g., integer, floating point, text, date

  33. Operations on Tables
     • Add new rows (or sometimes columns)
     • Modify existing rows
     • Choose a subset of columns
     • Choose a subset of rows
     • Combine rows (e.g., sum values in a column)
     • Combine columns
     • Combine two tables (join)
     • No operations to combine individual cells
       • Unlike a spreadsheet

  34. Three Basic Operations in a Relational Database
     • Select:
       • Creates a subset of rows that meet specific criteria
     • Join:
       • Combines relational tables to provide users with information
     • Project:
       • Enables users to create new tables containing only relevant information
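
The three operations can be sketched over tables held as lists of dictionaries. This is an illustrative simplification, not how a DBMS implements them: the function names and all sample rows are hypothetical, and this `project` keeps duplicate rows, which the formal relational operation would remove.

```python
company = [
    {"name": "alpha", "sales": 500, "indcode": 42},
    {"name": "beta", "sales": 900, "indcode": 44},
    {"name": "gamma", "sales": 200, "indcode": 42},
]
industry = [
    {"indcode": 42, "indname": "pharmaceuticals"},
    {"indcode": 44, "indname": "computers"},
]


def select(table, predicate):
    """Select: keep only the rows that meet the criterion."""
    return [row for row in table if predicate(row)]


def project(table, columns):
    """Project: keep only the named columns of every row."""
    return [{column: row[column] for column in columns} for row in table]


def join(left, right, key):
    """Join: combine rows from two tables that share the same key value."""
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if left_row[key] == right_row[key]
    ]


big = select(company, lambda row: row["sales"] > 400)
names = project(big, ["name"])
joined = join(company, industry, "indcode")
```

Because each operation takes tables and returns a table, they compose freely: `names` is a projection of a selection, and `joined` could be selected or projected in turn.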

  35. The three basic operations of a relational DBMS

  36. Operating on Databases: SQL
     • Every abstraction needs an interface through which users invoke abstract operations
       • Graphical interface
       • Language
     • Structured Query Language
       • SELECT (content) FROM (table) WHERE (condition)
     • We'll focus only on queries
       • Query = question
       • Extract some data from one or more tables to answer a particular question
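
As a sketch of the SELECT ... FROM ... WHERE pattern, the following uses Python's built-in `sqlite3` module. The `company` and `industry` tables loosely follow the shape of the Fortune 500 example, but the rows and values here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE industry (indcode INTEGER PRIMARY KEY, indname TEXT)")
conn.execute(
    "CREATE TABLE company (name TEXT, sales INTEGER, empls INTEGER, indcode INTEGER)"
)
conn.executemany(
    "INSERT INTO industry VALUES (?, ?)",
    [(42, "pharmaceuticals"), (44, "computers")],
)
conn.executemany(
    "INSERT INTO company VALUES (?, ?, ?, ?)",
    [("alpha", 500, 1200, 42), ("beta", 900, 3000, 44), ("gamma", 200, 800, 42)],
)

# Question: which companies have sales above 400?
# SELECT (content) FROM (table) WHERE (condition)
large = conn.execute(
    "SELECT name, sales FROM company WHERE sales > ? ORDER BY name", (400,)
).fetchall()

# Question: which companies are in pharmaceuticals?
# Answering it needs data from two tables, so the query joins them.
pharma = conn.execute(
    "SELECT c.name FROM company AS c"
    " JOIN industry AS i ON c.indcode = i.indcode"
    " WHERE i.indname = ? ORDER BY c.name",
    ("pharmaceuticals",),
).fetchall()
```

Each query states what data is wanted, not how to find it; the DBMS decides how to scan and combine the tables.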

  37. Physical vs. Logical Data View
     • The DBMS minimizes these problems by providing two "views" of the database data:
       • The physical view deals with the actual, physical arrangement and location of data on direct access storage devices (DASD).
       • The logical view, or user's view, represents data in a format that is meaningful to a user and to the software programs that process that data.
     • Entity-relationship diagram (ER diagram): Methodology for documenting databases by illustrating relationships between database entities
     • Normalization (normal forms): Process of creating small, stable data structures from complex groups of data

  38. Entity-relationship diagram

  39. Experiment: Microsoft Access
     • Features:
       • Create/modify databases
       • Specify/run queries
       • Design/print reports
       • Design graphical user interfaces around databases
       • Forms for entering and viewing data
     • Assignment: p. 136, App. Exer. 3; p. 169, App. Exer. 1

  40. Content
     • Database Concepts & Technology
     • Experiment: ACCESS
     • Database Trends
     • Data Trends of Application
     • Data Warehouse
     • OLAP
     • Data Mining
     • Creating Database Environment

  41. 2. Database Trends (1)
     • The evolution of database systems
       • Data: simple data => multimedia data, knowledge
       • Model: relational model => OO model, object-relational model

  42. Database Trends (2)
     • Application: OLTP => OLAP
     • Data organization: database => data warehouse, data marts
     • Query language: SQL => deductive

  43. Emerging Database Models
     The most common emerging database models are:
     • Deductive databases
     • Object-oriented databases
     • Multimedia and hypermedia databases
     • Multidimensional databases

  44. Object-Oriented Database Model
     • Object-oriented (OO) databases store both data and the procedures acting on the data, as objects.
     • Encapsulation capability
     • The OO database can be particularly helpful in multimedia environments, such as manufacturing sites using CAD/CAM.
     • OO databases can be particularly useful in supporting temporal and spatial dimensions.
     • Terminology in the OO model includes objects, attributes, classes, methods, and messages.
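
The OO terminology can be illustrated with a plain Python class. This is only an analogy for the vocabulary: the `Drawing` class, its attributes, and the `object_store` dictionary are hypothetical, and a real OO database would also persist the objects.

```python
class Drawing:
    """Class: a template for objects that bundle data with procedures."""

    def __init__(self, name, width_mm, height_mm):
        # Attributes: the data held by each object.
        self.name = name
        self._width_mm = width_mm
        self._height_mm = height_mm

    def area_mm2(self):
        """Method: a procedure stored with the data it acts on."""
        return self._width_mm * self._height_mm

    def scale(self, factor):
        """Encapsulation: size changes go through this method and its check."""
        if factor <= 0:
            raise ValueError("factor must be positive")
        self._width_mm *= factor
        self._height_mm *= factor


# A dictionary stands in for the object store.
object_store = {}
part = Drawing("bracket", 40, 25)      # object: one instance of the class
object_store[part.name] = part

object_store["bracket"].scale(2)       # message: ask the object to act
area = object_store["bracket"].area_mm2()
```

Callers never manipulate the width and height directly; they send messages, and the object applies its own rules.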

  45. Hypermedia Database Model
     • The hypermedia database model stores chunks of information in the form of nodes connected by links established by the user.
     • The nodes can contain text, graphics, sound, full-motion video, or executable computer programs.
     • Users can branch to related information in any kind of relationship.
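
The node-and-link structure can be sketched as a small graph. The node names, their media kinds, and the `reachable` helper are hypothetical; the point is only that navigation follows user-defined links rather than a fixed table or tree layout.

```python
from collections import deque

# Nodes: chunks of information of any media type.
nodes = {
    "overview": {"kind": "text"},
    "floor_plan": {"kind": "graphics"},
    "tour": {"kind": "video"},
    "glossary": {"kind": "text"},
}

# Links: established by the user, in any pattern (cycles are allowed).
links = {
    "overview": ["floor_plan", "glossary"],
    "floor_plan": ["tour"],
    "tour": ["overview"],
    "glossary": [],
}


def reachable(start, links):
    """Return every node a user can branch to from `start`, in visit order."""
    seen = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen


visited = reachable("overview", links)
```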

  46. A hypermedia database

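  47. Multidimensional Database
     • A variation of the relational model
     • Uses multidimensional structures to organize data and express the relationships between data
     • A dimension of the data: one side of a cube
     • (1) Multidimensional array
       • (Beijing, 1999, color TV, 10000)
       • (location, year, product type, sales amount)
     • (2) Dimension hierarchies
       • For example: year, quarter, month, date
       • Country, region, province, city
     • (3) Classes of elements within a dimension
       • For example: products grouped by price into high-end, mid-range, and low-end
       • Grouped by raw-material cost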
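
A multidimensional array can be sketched as a dictionary keyed by one coordinate per dimension. The first cell is the slide's example, (Beijing, 1999, color TV, 10000); every other cell, the `DIMENSIONS` tuple, and the `roll_up` function are hypothetical additions for illustration.

```python
from collections import defaultdict

DIMENSIONS = ("location", "year", "product")

# Each cell: (location, year, product type) -> sales amount.
cube = {
    ("Beijing", 1999, "color TV"): 10000,      # the slide's example cell
    ("Beijing", 1999, "refrigerator"): 4000,   # hypothetical
    ("Shanghai", 1999, "color TV"): 7000,      # hypothetical
    ("Shanghai", 2000, "color TV"): 8000,      # hypothetical
}


def roll_up(cube, keep):
    """Sum the cube's values over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(dimension) for dimension in keep]
    totals = defaultdict(int)
    for coordinates, value in cube.items():
        totals[tuple(coordinates[p] for p in positions)] += value
    return dict(totals)


by_location = roll_up(cube, ["location"])
by_product_year = roll_up(cube, ["product", "year"])
```

Rolling up along a dimension hierarchy (for example, city to province to country) works the same way once each coordinate is mapped to its parent level before summing.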

  48. Multidimensional data model

  49. Specialized Databases
     • There are many specialized databases, depending on the type or format of data stored.
     • A geographical information database contains locational data for overlaying on maps or images.
     • A knowledge database stores decision rules used to evaluate situations and help users make decisions like an expert.
     • A multimedia database stores data on many media: sounds, video, images, graphic animation, and text.

  50. Content
     • Database Concepts & Technology
     • Experiment: ACCESS
     • Database Trends
     • Data Trends of Application
     • Data Warehouse
     • OLAP
     • Data Mining
     • Creating Database Environment