1 / 24

Business Intelligence and Data warehousing

Business Intelligence and Data warehousing. BY ISABIRYE COLLIN ROBBERT cisabirye@technology.ucu.ac.ug 0772613892. Course aims and objectives. Give students the understanding on the role and operation of data warehouses To equip students with skills of developing data warehouses

Download Presentation

Business Intelligence and Data warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Business Intelligence and Data warehousing BY ISABIRYE COLLIN ROBBERT cisabirye@technology.ucu.ac.ug 0772613892

  2. Course aims and objectives • Give students the understanding on the role and operation of data warehouses • To equip students with skills of developing data warehouses • To equip students with skills of maintaining existing data warehouses • To equip students with skills of manipulating data warehouses to generate information for business decision making

  3. Learning Outcomes By the end of the course the students will be able to understand: • the importance of DWH and BI to an organization. • components of a data warehouse. • different forms of business intelligence and how they can help in business decision-making. • The process of developing a DWH

  4. Reference Books • Data warehousing fundamentals by Paulraj Ponniah, 2001, ISBN: 978-0-471-41254. • Prentice Hall, Decision Support Systems and Intelligent Systems, 7th Edition, Turban, Aronson, and Liang

  5. Course Outline • Week 1: Introduction to Business Intelligence , data warehouses and Corporate Information • Week 2: Data warehouse architectures • Week 3: Data warehouse lifecycle and project management • Week 4: Introduction to Dimensional Modeling • Week 5: Test 1 • Week 6: Building dimensional models • Week 7: Implementation of the data warehouse component • Week 8: Physical design, Indexing, Physical storage • Week 9: Data extraction, transformation and loading (ETL) • Week 10: Test 2 • Week 11: End User applications and Online analytical processing • Week 12: Infrastructure and Metadata

  6. Introduction • Data • Items that are the most elementary descriptions of things, events, activities, and transactions • May be internal or external • Information • Organized data that has meaning and value • Knowledge • Processed data or information that conveys understanding or learning applicable to a problem or activity • Knowledge is a familiarity with someone or something, that can include facts, descriptions, information, and/or skills acquired through experience or education

  7. Introduction • Over the years, storage and management of data from various operational systems has become a great challenge. • Long-term strategic planning has become increasingly important in the modern global market. • For this reason, companies have worked towards: • Access to information at all levels • Survival and prosper in a competitive world. • The focus of technology shifted from data input and capture through the operational systems to information access and availability.

  8. What is Business Intelligence (BI)? • All processes, techniques, and tools that support business decision making based on information technology. • Businessintelligence (BI) mainly refers to computer-based techniques used in identifying, extracting,and analyzing business data, such as sales revenue by products and/or departments, or by associated costs and incomes • The approaches can range from a simple excel spreadsheet to a major competitive intelligence undertaking.

  9. What is a Data Warehouse (DWH)? • A decision support database that is maintained separately from the organization’s operational database • Support information processing by providing a solid platform of consolidated, historical data for analysis. • “A data warehouse is a subject-oriented,integrated, time-variant, historicaland non-volatilecollection of data in support of management’s decision-making process.”—W. H. Inmon

  10. Data Warehouse—Subject-Oriented • Organized around major subjects, such as customer, product, sales. • Focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing. • Provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.

  11. Data Warehouse—Integrated • Constructed by integrating multiple, heterogeneous data sources • relational databases, flat files, on-line transaction records • Data cleaning and data integration techniques are applied. • Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources • E.g., Hotel price: currency, tax, breakfast covered, etc. • When data is moved to the warehouse, it is converted.

  12. Data Warehouse—Time Variant • The time horizon for the data warehouse is significantly longer than that of operational systems. • Operational database: current value data. • Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years) • Every key structure in the data warehouse • Contains an element of time, explicitly or implicitly • But the key of operational data may or may not contain “time element”.

  13. Data Warehouse—Non-Volatile • A physically separate store of data transformed from the operational environment. • Operational update of data does not occur in the data warehouse environment. • Does not require transaction processing, recovery, and concurrency control mechanisms • Requires only two operations in data accessing: • initial loading of dataand access of data.

  14. Order processing • 2 sec response time • Last 6 months order • DATA WAREHOUSE • Last 5 years data • Response time 2 sec – 60minutes • Data is not modified • Product price/ Inventory • 10 sec response time • Last 10 price changes • Last 20 invent transaction • Marketing • 30 sec response time. • Last 2 years program

  15. Data Warehouse vs. traditional integration in Heterogeneous DBMS • Traditional heterogeneous DB integration: • Build wrappers/mediators on top of heterogeneous databases • Query driven approach • When a query is posed to a client site, a meta-dictionary is used to translate the query into queries appropriate for individual heterogeneous sites involved, and the results are integrated into a global answer set • Complex information filtering, compete for resources • Data warehouse: update-driven, high performance • Information from heterogeneous sources is integrated in advance and stored in warehouses for direct query and analysis

  16. Data Warehouse (OLAP) vs. Operational DBMS (OLTP) • OLTP (on-line transaction processing) • Major task of traditional relational DBMS • Day-to-day operations such as purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc. • OLAP (on-line analytical processing) • Major task of data warehouse system • Data analysis and decision making • Distinct features (OLTP vs. OLAP): • User and system orientation: customer vs. market • Data contents: current, detailed vs. historical, consolidated • Database design: ER + application vs. star + subject • View: current, local vs. evolutionary, integrated • Access patterns: update vs. read-only but complex queries

  17. OLTP vs. OLAP

  18. Why Create a Data Warehouse? Successfully implemented data warehouses can bring benefits to an organization as below: Performance: - Operational Systems and Data Warehouse Simplify - Make Complex Data from Many Systems Available in One Accuracy - Standardize and Cleanse Business Value - Provide the Foundation for the Business to Have Access to Information to Make Timely, Informed Decisions Direct Use - Non-IT personnel can make reports

  19. DWH complex queries • Typical data warehouse queries (Case study: Banking industry) • Which corporate customers are above the average account usage per month and how does this correlate to their business? • Who were the first hundred customers in Jan 2006 and how does this list compare with the list for the previous three years? • What is the revenue by destination, by month, by business unit, by region?

  20. Complexities of Creating a Data Warehouse • Incomplete errors • Missing Fields • Records or Fields That, by Design, are not Being Recorded • Incorrect errors • Wrong Calculations, Aggregations • Duplicate Records • Wrong Information Entered into Source System • Inconsistency errors • Inconsistent Use of Different Codes

  21. Best Practices • Data Warehousing is a process and not a project • Complete requirements and design • Prototyping is key to business understanding • Utilizing proper aggregations and detailed data • A full iterative approach is essential • Training is an on-going process • Build data integrity checks into your system

  22. DWH Drawbacks • High investment • The initial cost of building a data warehouse is very high and ROI cannot easily be explained. • Large storage • Data warehouse stores useful historical data of an enterprise. • Maintenance of source systems • If data source systems are not cleaned, we automatically get dirty data into the data warehouse. Decision markers using such data are likely to be mislead and their decisions may lead to loss of company revenue. • Qualified staff • Data warehouse building and maintenance requires skilled personnel.

  23. Return on Investment - Justification • New insights into • Customer habits • Developing new products • Selling more products • Cost savings and revenue increases • Cross-selling of products • Identify and target most profitable customers

  24. Conclusions • Building data warehouse is good but not sufficient. The data in a data warehouse has to be accessed by users and in order to access it; a BI tool has to be used.

More Related