Building a smarter planet: Financial Services
Exploding Demands for Big Data, Analytics, Risk Management, Ultra-Low Latency and Compute Power Require Optimized HPC Infrastructures
Robert Brinkman, Infrastructure Architect for Banking and Financial Markets, IBM Banking Center of Excellence
Panel
Emile Werr, VP, Global Data Services, Global Head of Enterprise Data Architecture, NYSE Euronext
Vikram Mehta, Vice President, IBM System Networking, IBM Corp.
Nick Werstiuk, Product Line Executive, IBM Platform Computing
Dino Vitale, Director, Cross Technology Services, Morgan Stanley
Workload Optimized Stacks
Applications: IBM provided, ISVs, partners, custom; messaging and security layer
Five workload-optimized stacks — Cloud Stack, Transaction Stack, Low Latency Stack, Grid Stack, and Appliances & Packages — delivered as discrete components or applications, or as packaged hardware and software
Workload characteristics addressed: high transaction rates, high message rates, specialized workloads, big data, big compute, data value decay, variable workload, complex data models
Financial Markets Industry Imperatives
• Re-engineer for profitable growth: renewed focus on the customer; near-real-time analytics
• Improve the trade life cycle: cloud and business process outsourcing
• Optimize enterprise risk management: data-driven transformation and common industry services
Dino Vitale, Director, Cross Technology Services, Morgan Stanley
Morgan Stanley: Road to Compute as a Service
Trends
• Maximize efficiency of the compute infrastructure
• Cost / run-rate
• Utilization – more with less, linear scale, sharing
• Operational normalization
Challenges
• Phasing
• Dynamic provisioning and on-demand scaling of resources to applications according to varying business needs and SLAs
• Multi-tenant workload protection
• Application design and dependency management
• Utility charge-back model options: pay-per-use, fixed allocation, or a hybrid approach (see the sketch after this list)
• Sharing resources based on workload supply and demand
• BCP
• Convergence opportunities with "Big Data"
• Increasing data volumes
• Adaptive/real-time scheduling
• Resource management
• Metrics / data mining
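The charge-back options above lend themselves to a small illustration. A minimal sketch follows, assuming made-up rates and a reserved-plus-burst hybrid; none of this reflects Morgan Stanley's actual billing model.

```python
# Illustrative charge-back models for shared compute: pay-per-use,
# fixed allocation, and a hybrid of the two. All rates are assumptions.

def pay_per_use(core_hours: float, rate: float = 0.05) -> float:
    """Bill only for the core-hours actually consumed."""
    return core_hours * rate

def fixed_allocation(reserved_cores: int, hours: float, rate: float = 0.03) -> float:
    """Bill for reserved capacity whether or not it is used."""
    return reserved_cores * hours * rate

def hybrid(core_hours: float, reserved_cores: int, hours: float) -> float:
    """Reserved base load at a discounted rate, burst usage at on-demand rates."""
    base = fixed_allocation(reserved_cores, hours)
    burst = max(0.0, core_hours - reserved_cores * hours)
    return base + pay_per_use(burst)

if __name__ == "__main__":
    # A tenant reserving 100 cores for a 720-hour month, consuming 90,000 core-hours.
    print(f"pay-per-use: ${pay_per_use(90_000):,.2f}")
    print(f"fixed:       ${fixed_allocation(100, 720):,.2f}")
    print(f"hybrid:      ${hybrid(90_000, 100, 720):,.2f}")
```

The hybrid variant is typically the compromise: the reserved base gives the infrastructure team a predictable run-rate, while burst usage is still metered so tenants pay for spikes.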
On-Demand Data in a High-Performance Environment
Emile Werr, VP, Global Data Services, Global Head of Enterprise Data Architecture & Identity Management
Technology Challenges
• Big data: billions of transactions and multiple terabytes captured daily
• Speed and business agility are essential to our business
• Different viewpoints and data patterns need to be analyzed
• Data coming out of a trading plant is not user-friendly
• Correlating and integrating disparate data
• Moving large data around is expensive and complex
• System capacity must efficiently handle 5x our average daily volume (see the sizing sketch after this list)
• Data spikes: the day after the Flash Crash, volume peaked at over 18.4 billion transactions for the NYSE Classic matching engine (excluding Options and other markets such as Arca, Amex, Liffe, Euronext, etc.)
• Transaction volume growth is sustained year over year
• Data must be readily available for a minimum of 7 years for compliance, yet it is too expensive to keep it all online
• Change is constant
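The 5x headroom target and the 7-year retention window imply rough capacity figures. A back-of-the-envelope sketch, assuming an illustrative average daily volume and record size (the slide gives neither):

```python
# Back-of-the-envelope capacity sizing from the figures on this slide.
# The 5x headroom rule and 7-year retention come from the slide; the
# average daily volume and bytes-per-record below are assumptions.

AVG_DAILY_TXNS = 4e9           # assumption: ~4 billion transactions/day
PEAK_MULTIPLIER = 5            # size the plant for 5x the average day
BYTES_PER_RECORD = 250         # assumption: average enriched record size
RETENTION_YEARS = 7            # compliance retention window
TRADING_DAYS_PER_YEAR = 252

# Online plant sized for a 5x spike day.
peak_day_bytes = AVG_DAILY_TXNS * PEAK_MULTIPLIER * BYTES_PER_RECORD
# Compliance archive sized for 7 years of average days.
archive_bytes = (AVG_DAILY_TXNS * BYTES_PER_RECORD
                 * TRADING_DAYS_PER_YEAR * RETENTION_YEARS)

TB, PB = 1024 ** 4, 1024 ** 5
print(f"peak-day capture:   {peak_day_bytes / TB:,.1f} TB")
print(f"7-year raw archive: {archive_bytes / TB:,.1f} TB "
      f"({archive_bytes / PB:,.2f} PB, before compression)")
```

Even with these conservative assumptions the raw archive lands in the petabyte range, which is why the later slides push compression, tiered storage pools, and automated archive/purge rather than keeping everything online.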
USE CASE: Market Reconstruction for Trading Surveillance
Data Architecture Practice; Financial Services, Regulatory & Compliance Expertise
[Figure: book reconstruction example — an order arrives (BUY 10 @ 20.09); its price level determines Shares Ahead and Shares Available; Full Quote Size is the best-quote size from the last published best quote.]
The electronic book (NYSE DBK) and market depth need to be reconstructed and made accessible via a fast database. "Who traded ahead, or was there interpositioning?" can then be answered by a database query.
• Trading systems generate vast transaction volumes at high speeds
• The grid is used to transform, normalize, and enrich the time-series data with massively parallel computing, run as end-of-day or intra-day batch processing
• Date-level table scans (queries) also need massively parallel processing (MPP)
• Appropriate technologies need to be utilized: 10 Gb network, virtualized CPU/memory, appliance databases, scalable storage pools
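A minimal sketch of the reconstruction idea, assuming a hypothetical event schema (not the actual NYSE DBK format): replaying the order stream rebuilds FIFO time priority at each price level, so "shares ahead" becomes a simple lookup.

```python
# Minimal sketch of market reconstruction: replay an order event stream to
# rebuild time priority at each price level, so "who was ahead of this
# order?" becomes a lookup. Event fields here are hypothetical.
from collections import OrderedDict, defaultdict
from typing import Iterable

class BuyBook:
    """One side of the book: price level -> resting orders in arrival order."""
    def __init__(self) -> None:
        self.levels = defaultdict(OrderedDict)

    def apply(self, e: dict) -> None:
        level = self.levels[e["price"]]
        if e["type"] == "add":
            level[e["order_id"]] = e["qty"]
        elif e["type"] == "execute":
            level[e["order_id"]] -= e["qty"]
            if level[e["order_id"]] <= 0:
                del level[e["order_id"]]
        elif e["type"] == "cancel":
            level.pop(e["order_id"], None)

    def shares_ahead(self, price: float, order_id: str) -> int:
        """Resting size with time priority ahead of order_id at its price level."""
        ahead = 0
        for oid, qty in self.levels[price].items():
            if oid == order_id:
                return ahead
            ahead += qty
        raise KeyError(order_id)

def replay(events: Iterable[dict]) -> BuyBook:
    book = BuyBook()
    for e in events:
        book.apply(e)
    return book

if __name__ == "__main__":
    book = replay([
        {"type": "add", "order_id": "A", "price": 20.09, "qty": 500},
        {"type": "add", "order_id": "B", "price": 20.09, "qty": 200},
        {"type": "add", "order_id": "C", "price": 20.09, "qty": 10},  # BUY 10 @ 20.09
    ])
    print(book.shares_ahead(20.09, "C"))  # -> 700 shares ahead of order C
```

At exchange scale this replay runs partitioned by symbol and date on the grid, which is why the slide pairs it with MPP scans rather than a single-threaded pass.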
Data Lifecycle Management Methodology
1. Data Capture: trading data, market data, reference data, user-generated data
2. Data Transformation & Archive: transform, normalize, enrich; partition, compress, and archive in storage pools; create metadata (mappings)
3. On-Demand Data (ODD): automate data archive and purge; secure data access and navigation; load, extract, stream, filter, transform, purge; user-driven data mart provisioning ("sandboxing"); schema change capture ("data structure lineage")
4. End-User Workflow / User Analytics ("Business Intelligence"): utilize MPP databases and HDFS; integrate reporting tools; facilitate user collaboration; capture knowledge (KM)
Spans enterprise systems end to end; a partition-and-archive sketch follows.
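A minimal sketch of the "partition, compress, and archive" step, assuming illustrative JSON records, gzip partitions, and a metadata-mapping layout of our own invention:

```python
# Sketch of "partition, compress, archive": records are bucketed by trade
# date, written as compressed partitions in a storage pool, and a metadata
# mapping is kept so on-demand queries can locate partitions without
# scanning. Paths and the metadata layout are illustrative assumptions.
import gzip
import json
from collections import defaultdict
from pathlib import Path

def archive(records: list, root: Path) -> dict:
    by_date = defaultdict(list)
    for rec in records:
        by_date[rec["trade_date"]].append(rec)

    metadata = {}
    for date, recs in by_date.items():
        part = root / f"trade_date={date}" / "part-0000.json.gz"
        part.parent.mkdir(parents=True, exist_ok=True)
        with gzip.open(part, "wt") as f:
            for rec in recs:
                f.write(json.dumps(rec) + "\n")
        metadata[date] = {"path": str(part), "rows": len(recs)}

    # The mapping file is what the ODD layer consults at query time.
    (root / "_metadata.json").write_text(json.dumps(metadata, indent=2))
    return metadata

if __name__ == "__main__":
    meta = archive([{"trade_date": "2010-05-07", "sym": "IBM", "qty": 100}],
                   Path("/tmp/odd_demo"))
    print(meta)
```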
Managed Data Services & Data Flow Automation
Flow: business demand → data capture (feed handler, message bus, files, data pump; continuous flow / trickle batch) → transformation & archive on a scale-out grid fabric with distributed CPU/memory (Hadoop, storage pools, Netezza analytics data warehouse) → data virtualization & abstraction (data provisioning, data tools) → consumers: analysts, data scientists, researchers, apps, admins
Data Services
• Standardization and consistency
• Agile framework – metadata driven
• Metering, monitoring, and tracking
• Common secured access
• Automation and workflow
• Simplified access and administration
• File and database virtualization
• Fast processing and data movement
• Scalable
• Reliable
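A minimal sketch of the continuous-flow (trickle batch) pattern named above, assuming a stand-in sink in place of the real storage pool:

```python
# Sketch of "continuous flow (trickle batch)": messages from the feed are
# buffered and flushed to the storage pool either when a batch fills or
# when a time window elapses, trading a little latency for far fewer,
# larger writes. The sink callable is a stand-in for the real pool.
import time
from typing import Callable

class TrickleBatcher:
    def __init__(self, sink: Callable[[list], None],
                 max_batch: int = 1000, max_wait_s: float = 5.0) -> None:
        self.sink, self.max_batch, self.max_wait_s = sink, max_batch, max_wait_s
        self.buffer: list = []
        self.last_flush = time.monotonic()

    def offer(self, message) -> None:
        self.buffer.append(message)
        if (len(self.buffer) >= self.max_batch
                or time.monotonic() - self.last_flush >= self.max_wait_s):
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []
        self.last_flush = time.monotonic()

if __name__ == "__main__":
    batcher = TrickleBatcher(sink=lambda b: print(f"flushed {len(b)} messages"),
                             max_batch=3)
    for i in range(7):
        batcher.offer({"seq": i})
    batcher.flush()  # drain the tail
```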
Vikram Mehta, Vice President, IBM System Networking, IBM Corporation
Nick Werstiuk, Product Line Executive, IBM Platform Computing
Convergence of Compute and Data
Workload spectrum: compute intensive → data intensive → compute and data intensive
• Data type: structured (RDBMS, fixed records) → unstructured (video, e-mail, web) → all (structured + unstructured)
• Application use cases: risk analytics, sentiment analysis/CRM, CEP, BI reporting, simulation, gaming, streaming trading, genomics, AML/fraud, ETL, pricing
• Characteristics: "real time", intraday, daily, monthly, quarterly
• Infrastructure: compute grid, data caches, in-memory grid, shared services, CPU + GPU, commodity processors + storage, dedicated servers, appliances, FPGAs, disk & tape, SMP & mainframe, SAN/NAS, data warehouses
Support for Diverse Workloads & Platforms
Data-intensive applications: geo-spatial integration, name classification, signal processing, metadata generation, file classification, batch analysis, search, analysis, concept recognition
[Figure: heterogeneous workloads (classes A, B, C) placed across a shared resource pool by the workload manager, with resource orchestration underneath.]
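A toy sketch of the orchestration idea in the figure, assuming a simple proportional-share policy (the actual placement logic in IBM Platform products is far richer):

```python
# Toy sketch of the workload manager in the figure: application classes
# compete for a shared slot pool, and the orchestrator allocates slots in
# proportion to configured shares, redistributing what a class doesn't use.
def allocate(free_slots: int, shares: dict, demand: dict) -> dict:
    """Proportional-share allocation with redistribution of unused slots."""
    total = sum(shares.values())
    alloc = {}
    remaining = free_slots
    # First pass: each class gets min(its demand, its proportional entitlement).
    for c, s in shares.items():
        alloc[c] = min(demand[c], free_slots * s // total)
        remaining -= alloc[c]
    # Second pass: leftover slots go to classes with unmet demand.
    for c in sorted(shares, key=shares.get, reverse=True):
        extra = min(remaining, demand[c] - alloc[c])
        alloc[c] += extra
        remaining -= extra
    return alloc

if __name__ == "__main__":
    print(allocate(100, shares={"A": 50, "B": 30, "C": 20},
                   demand={"A": 10, "B": 80, "C": 40}))
    # -> {'A': 10, 'B': 70, 'C': 20}: A's unused entitlement flows to B.
```

The point of the figure is exactly this redistribution: no class's slots sit idle while another class has pending work.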
Why IBM Platform Symphony is faster and more scalable
[Figure: latency vs. scale for Symphony and other grid servers.]
Other grid servers: inefficient scheduling, a polling model, and heavy-weight transport protocols limit scalability.
Symphony: with a zero-wait-time "push model" and efficient binary protocols, Symphony scales until the "wire" is saturated.
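A toy illustration of why the push model wins, assuming illustrative timings: under polling, a task waits on average half the poll interval before any worker notices it, while a push dispatch costs only the wire round trip.

```python
# Toy model of the polling-vs-push latency gap described above. Under
# polling, a new task waits until the worker's next poll tick (on average
# half the interval); under the push model the scheduler dispatches to an
# idle worker immediately. All timings are assumptions for illustration.
import random

def mean_polling_wait(n_tasks: int, poll_interval_s: float) -> float:
    """Tasks arrive uniformly within a poll interval and wait for the tick."""
    waits = [random.uniform(0.0, poll_interval_s) for _ in range(n_tasks)]
    return sum(waits) / n_tasks

PUSH_DISPATCH_S = 0.0002  # assumption: ~200 microseconds on the wire

if __name__ == "__main__":
    random.seed(1)
    print(f"polling, 1 s interval: ~{mean_polling_wait(10_000, 1.0) * 1000:.0f} ms/task")
    print(f"push model:            ~{PUSH_DISPATCH_S * 1000:.1f} ms/task")
```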
HPC Cloud – Multiple Approaches and Paths to Value
Infrastructure Management — build out a more dynamic HPC infrastructure as their HPC cloud:
• Cluster consolidation into an HPC cloud
• Self-service cluster provisioning and management
• Workload-driven dynamic clusters
HPC "In the Cloud" — leverage the public cloud opportunity, either to tap into additional resources or to offer their own HPC cloud services:
• 'Bursting' to cloud providers
• Hosted HPC in the cloud
• Enable HPC cloud service providers
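A minimal sketch of a 'bursting' policy for the hybrid path above, assuming a simple threshold rule and leaving the actual provider integration out:

```python
# Sketch of a simple bursting policy: run on the internal HPC cloud up to
# its free capacity and spill the remainder to a public provider, but only
# when the overflow is large enough to justify the spin-up cost. The
# threshold and the provider hook are assumptions, not a product feature.
def plan_burst(pending_jobs: int, internal_free: int,
               burst_threshold: int = 0) -> dict:
    """Split pending work between internal capacity and the public cloud."""
    internal = min(pending_jobs, internal_free)
    overflow = pending_jobs - internal
    to_cloud = overflow if overflow > burst_threshold else 0
    return {"internal": internal, "cloud": to_cloud,
            "queued": overflow - to_cloud}

if __name__ == "__main__":
    print(plan_burst(pending_jobs=500, internal_free=350, burst_threshold=50))
    # -> {'internal': 350, 'cloud': 150, 'queued': 0}
```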
Questions Building a smarter planet: Financial Services