WebFOCUS Hyperstage: Analyze and Report from Large Volumes of Data
Information Builders (Canada) Inc., May 11, 2012
WebFOCUS: Higher Adoption and Reuse with Lower TCO

Extended BI
• Mobile Applications
• Data Updating
• High Performance Data Store
• Predictive Analytics
• Visualization & Mapping
• Enterprise Search
• Business Activity Monitoring
• Data Profiling & Data Quality
• Master Data Management

Core BI
• Performance Management
• MS Office & e-Publishing
• Query & Analysis
• Dashboards
• Data Warehouse & ETL
• Business to Business Information Delivery
• Reporting

Extensions to the WebFOCUS platform allow you to build more application types at a lower cost.
WebFOCUS High Performance Data Store
(Platform diagram repeated from the previous slide, with the High Performance Data Store extension highlighted.)
Today’s Top Data-Management Challenge: Big Data and Machine-Generated Data
(Chart: data storage growth over time, comparing machine-generated and human-generated data.)
How Performance Issues Are Typically Addressed, by Pace of Data Growth
When organizations have long-running queries that limit the business, the response is often to spend much more time and money to resolve the problem. IT managers try to mitigate these response times.
Source: Keeping Up with Ever-Expanding Enterprise Data (Joseph McKendrick, Unisphere Research, October 2010)
Classic Approaches and Challenges: Data Warehousing
More kinds of output needed by more users, more quickly; limited resources and budget; more data and more data sources (real-time data, multiple databases, external sources).
Traditional data warehousing:
• Labour intensive: heavy indexing, aggregations, and partitioning
• Hardware intensive: massive storage, big servers
• Expensive and complex
Classic Approaches and Challenges: Data Warehousing, Growing Demands
Early data warehouse characteristics:
• Integration of internal systems
• Monthly and weekly loads
• Heavy use of aggregates
Data warehousing matures:
• Near real-time updates
• Integration with master data management
• Data mining using discrete business transactions
• Provision of data for business-critical applications
New demands:
• Larger transaction volumes driven by the internet
• Impact of cloud computing
• More -> Faster -> Cheaper
Classic Approaches and Challenges: Dealing with Large Data
• Indexes
• Cubes/OLAP
Classic Approaches and Challenges: Limitations of Indexes
• Increased space requirements: the sum of index space requirements can exceed the size of the source database
• Index management overhead
• Increased load times: the index must be rebuilt or maintained on every load
• Predefines a fixed access path
Classic Approaches and Challenges: Limitations of OLAP
• Cube technology has limited scalability: both the number of dimensions and the amount of data are limited
• Cubes are difficult to update; adding a dimension usually requires a complete rebuild
• Cube builds are typically slow
• A new design results in a new cube
Limitations of Rows: These Solutions Contribute to Operational Limitations
• Impediments to business agility: users wait for DBAs to create indexes or other tuning structures, delaying access to data. Indexes also significantly slow data-loading operations and increase the size of the database, sometimes by a factor of 2x.
• Loss of data and time fidelity: ETL operations are typically performed in batch during non-business hours, which delays access to data and often results in mismatches between operational and analytic databases.
• Limited ad hoc capability: response times for ad hoc queries increase as the volume of data grows, and unanticipated queries (where DBAs have not tuned the database in advance) can result in unacceptable response times.
• Unnecessary expenditures: attempts to improve performance with hardware acceleration and database tuning raise the capital costs of equipment and the operational costs of database administration, and the added complexity of managing a large database diverts operational budgets away from more urgent IT projects.
The Limitation of Rows: The Ubiquity of Rows
Row-based databases are ubiquitous because so many of our most important business systems are transactional. Row-oriented databases are well suited to transactional environments, such as a call center where a customer's entire record is required when their profile is retrieved, or where fields are frequently updated.
But consider a table of 50 million rows and 30 columns: disk I/O becomes a substantial limiting factor, because a row-oriented design forces the database to retrieve all column data for any query.
Pivoting Your Perspective: Columnar Technology

Employee Id | Name   | Location | Sales
1           | Smith  | New York | 50,000
2           | Jones  | New York | 65,000
3           | Fraser | Boston   | 40,000
4           | Fraser | Boston   | 70,000

Row oriented: (1, Smith, New York, 50000; 2, Jones, New York, 65000; 3, Fraser, Boston, 40000; 4, Fraser, Boston, 70000)
• Works well if all the columns are needed for every query
• Efficient for transactional processing, since all the data for a row is stored together

Column oriented: (1, 2, 3, 4; Smith, Jones, Fraser, Fraser; New York, New York, Boston, Boston; 50000, 65000, 40000, 70000)
• Works well for aggregate results (sum, count, avg)
• Only the columns relevant to a query need to be touched
• Consistent performance with any database design
• Allows for very efficient compression
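The row-versus-column contrast above can be sketched in a few lines of Python. This is an illustrative model of the two storage layouts, not Hyperstage internals: an aggregate over Sales in a row store must walk every record, while a column store reads only the one list it needs.

```python
# The same four employee records stored two ways.
rows = [
    (1, "Smith", "New York", 50000),
    (2, "Jones", "New York", 65000),
    (3, "Fraser", "Boston", 40000),
    (4, "Fraser", "Boston", 70000),
]

# Column-oriented layout: one contiguous list per column.
columns = {
    "id":       [r[0] for r in rows],
    "name":     [r[1] for r in rows],
    "location": [r[2] for r in rows],
    "sales":    [r[3] for r in rows],
}

# Row store: summing Sales still touches every field of every row.
total_row_store = sum(r[3] for r in rows)

# Column store: the same aggregate touches only the Sales column.
total_col_store = sum(columns["sales"])

assert total_row_store == total_col_store == 225000
```

On disk the difference is I/O, not arithmetic: the row layout forces the engine to read all 30 columns of a wide table to answer a one-column aggregate, while the columnar layout reads roughly 1/30th of the data.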
Introducing WebFOCUS Hyperstage
• Mission: improve database performance for WebFOCUS applications with less hardware, no database tuning, and easy migration
• What is WebFOCUS Hyperstage: a high-performance analytic data store designed to handle business-driven queries on large volumes of data without IT intervention. Easy to implement and manage, Hyperstage provides the answers your business users need at a price you can afford.
• Advantages:
• Dramatically increases performance of WebFOCUS applications
• Disk footprint reduced with a powerful compression algorithm, for faster response times
• Embedded ETL for seamless migration of existing analytical databases
• No change in query or application required
• Includes an optimized Hyperstage adapter
• WebFOCUS metadata can be used to define hierarchies and drill paths to navigate the star schema
Introducing WebFOCUS Hyperstage: How It Is Architected
The Hyperstage engine (Knowledge Grid, compressor, bulk loader) combines a columnar database with intelligence we call the Knowledge Grid to deliver fast query responses.
Unmatched administrative simplicity:
• No indexes
• No data partitioning
• No manual tuning
Improve database performance for WebFOCUS applications with less hardware, no database tuning, and easy migration.
Introducing WebFOCUS Hyperstage: What It Means for Customers
• Self-managing: 90% less administrative effort
• Low cost: more than 50% less than alternative solutions
• Scalable, high performance: up to 50 TB on a single industry-standard server
• Fast queries: ad hoc queries are as fast as anticipated queries, so users have total flexibility
• Compression: data compression of 10:1 to 40:1 means far less storage is needed; it may even mean the entire database fits in memory
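The in-memory claim follows from simple arithmetic. A quick back-of-the-envelope check, using an illustrative 5 TB raw database (the size is an assumption, not a figure from the slides):

```python
# At the quoted 10:1 to 40:1 compression ratios, how big does a
# 5 TB raw database become?
raw_tb = 5.0
compressed_gb = {ratio: raw_tb * 1024 / ratio for ratio in (10, 40)}

# 10:1 -> 512 GB; 40:1 -> 128 GB, small enough to plausibly fit in
# the RAM of a single industry-standard server.
assert compressed_gb[10] == 512.0 and compressed_gb[40] == 128.0
```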
Introducing WebFOCUS Hyperstage: How It Works
On load, Hyperstage automatically creates information (metadata) about the data and:
• Stores it in the Knowledge Grid (KG)
• Loads the KG into memory; it is less than 1% of the compressed data size
When processing a query, Hyperstage uses this metadata to eliminate or reduce the need to access data. The less data that needs to be accessed, the faster the response, with sub-second responses when a query can be answered by the KG alone.
Architecture benefits:
• No need to partition data, create or maintain indexes or projections, or tune for performance
• Ad hoc queries are as fast as static queries, so users have total flexibility
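The metadata-pruning idea can be sketched as follows. This is a hypothetical simplification of the Knowledge Grid concept, not the actual engine: each pack of column values carries min/max statistics, so a filter query can skip packs that cannot match and answer fully-matching packs from metadata alone, decompressing and scanning only the "suspect" packs in between. The tiny pack size is for readability; a real engine uses much larger packs.

```python
PACK_SIZE = 4  # illustrative; real data packs hold far more values

def build_packs(values):
    """Split a column into packs, recording min/max metadata per pack."""
    packs = []
    for i in range(0, len(values), PACK_SIZE):
        chunk = values[i:i + PACK_SIZE]
        packs.append({"data": chunk, "min": min(chunk), "max": max(chunk)})
    return packs

def count_greater_than(packs, threshold):
    """Count values > threshold, scanning as few packs as possible."""
    total, scanned = 0, 0
    for p in packs:
        if p["max"] <= threshold:
            continue                    # irrelevant pack: skip entirely
        if p["min"] > threshold:
            total += len(p["data"])     # fully relevant: answered from metadata
            continue
        scanned += 1                    # suspect pack: must access the data
        total += sum(1 for v in p["data"] if v > threshold)
    return total, scanned

packs = build_packs([10, 20, 30, 40, 55, 60, 70, 80, 41, 45, 52, 48])
count, scanned = count_greater_than(packs, 50)
assert count == 5 and scanned == 1      # only one of three packs was scanned
```

This is why the slide says "the less data that needs to be accessed, the faster the response": when every pack is either skipped or answered from metadata, no data is touched at all.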
WebFOCUS Hyperstage Engine: How It Works
• Column orientation
• Knowledge Grid: statistics and metadata "describing" the super-compressed data
• Data packs: data stored in manageably sized, highly compressed packs, using compression algorithms tailored to each data type
Smarter architecture: no maintenance, no query planning, no partition schemes, no DBA.
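Type-tailored compression works because columns are homogeneous. As an illustrative sketch (not the actual Hyperstage compressor), a low-cardinality string column like Location compresses extremely well with dictionary encoding followed by run-length encoding of the codes; a numeric column might instead use a delta scheme.

```python
from itertools import groupby

def dict_rle_encode(column):
    """Dictionary-encode distinct values, then run-length encode the codes."""
    dictionary = sorted(set(column))
    code = {v: i for i, v in enumerate(dictionary)}
    runs = [(code[v], sum(1 for _ in g)) for v, g in groupby(column)]
    return dictionary, runs

def dict_rle_decode(dictionary, runs):
    """Reverse the encoding to recover the original column."""
    return [dictionary[c] for c, n in runs for _ in range(n)]

location = ["New York"] * 3 + ["Boston"] * 5 + ["New York"] * 2
dictionary, runs = dict_rle_encode(location)

# Ten strings collapse to a 2-entry dictionary and 3 (code, length) pairs.
assert runs == [(1, 3), (0, 5), (1, 2)]
assert dict_rle_decode(dictionary, runs) == location
```

A row store cannot do this, because adjacent values on disk belong to different columns with different types; the columnar layout is what makes the 10:1 to 40:1 ratios quoted earlier plausible.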
WebFOCUS Hyperstage: The Big Deal
No indexes, no partitions, no views, no materialized aggregates. No DBA required!
Value proposition:
• Low IT overhead
• Autonomy from IT
• Ease of implementation
• Fast time to market
• Less hardware
• Lower TCO
Example: FOCUS to Hyperstage Compression, 243,639 Rows