
Unified Monitoring and Analytics [Seamless Operational Visibility] in the Cloud

A unified monitoring and analytics solution for cloud instances that provides deep visibility and operational insight. It automates data collection and analysis to solve real-world pain points in cloud environments.



Presentation Transcript


  1. Unified Monitoring and Analytics [Seamless Operational Visibility] in the Cloud
     Ricardo Koller, Canturk Isci, Sahil Suneja, Eyal de Lara
     HotCloud 2015
     - Agentless Crawlers
     - Unified Monalytics Pipeline
     - Real-world Deployment
     - Applications
     CloudSight

  2. Monitoring & Analytics Designed for Cloud
     • What is great:
       • Density
       • Scale
       • Portability
       • Repeatability
       • Speed
       • Automation
     • What needs work:
       • Visibility
       • Operational Insight
     - Modernization of IT infra and SW delivery
     - Complex made simple
     - Unprecedented efficiency and TTV
     - Lots of shiny toys across IT lifecycle
     - Visibility into our environments remains an issue
     - Also lots of shiny toys for monitoring & analytics
     BUT:
     - Still based on traditional IT principles!
     Our Goal:
     - Provide deep, seamless visibility into cloud instances
     - Drive operational analytics to solve real-world pain points
     Seamless Operational Visibility and Analytics | 2015

  3. …Monitoring & Analytics Designed for Cloud
     • What is emerging:
       • Diverse runtimes, patterns, instance modalities
       • All requiring the same kind of visibility and ops insight
     BUT:
     - We keep designing separate, siloed solutions for each
     Our Goal:
     - Provide deep, seamless and unified visibility into ALL cloud instances
     - Drive operational analytics to solve real-world pain points

  4. Key Driver: Agentless (Non-intrusive) Introspection
     [Figure labels: Security Monitor Patch Relevance Compliance Audit]

  5. Traditional Monitoring vs. Crawlers
     [Figure labels: Host, BMS, VM, Container (Cont.), OS, Workload (Wkld), Agent (A)]

  6. Monitor Me! Unified Agentless Monitoring with Crawlers
     • What it is:
       • Seamless, always-on monitoring
       • Builds upon introspection abstractions
       • No agents, no guest credentials/access
       • Works out of the box (if you check the box ;))
       • Monitoring built into the platform, not in end-user systems
     • Advantages:
       • Consumability (lower the barrier to data collection and analytics)
       • Security (No attack surface, credentials, external connections in guest)
       • No complexity to end-user (They do nothing, all they see is the service)
       • Compliance (Maintenance out of guest context, no SW to install/manage)
       • Availability (From birth to death + can inspect even if guest is defunct)
       • Guest Agnostic (Same crawler for all)
       • Decoupled (From guest context and overhead/side-effect concerns)
     • Monitoring done right for the processes of the Cloud OS

  7. Unified Cloud Monitoring Pipeline
     Architecture: Crawlers → Index → Query APIs | Monalytics
     [Figure labels: Docker Hosts running containers (Cont.) with Apps; Data Bus; INDEX; Search, query APIs; Infra state, events and metadata; State, Ops, Event data from CTs/VMs/BMs]
     Analytics:
     - Failure notification
     - Healthcheck
     - Resource Usage
     - Application Insight
     - Topo Discovery
     - Drift Detection
     - Container Sprawl
     - Depl. Validation
     - Config analytics
     - Compliance Svc
     - Vulnerability Scan
     Our Approach:
     - Seamlessly “crawl” the cloud like we crawl the web
     - Query/mine the cloud like we query/mine the web
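
The crawl, index, and query flow on slide 7 can be illustrated with a small in-memory sketch. This is not the authors' implementation: `Frame`, `Index`, `ingest`, `latest`, and `query` are hypothetical names chosen for illustration, and a real deployment would place a data bus and a persistent, multitenant index between the crawlers and the analytics.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Frame:
    """One crawl snapshot of a single instance (hypothetical structure)."""
    instance_id: str   # container, VM, or bare-metal identifier
    timestamp: int     # crawl time, e.g. seconds since epoch
    features: dict     # crawled state: packages, processes, config, ...


class Index:
    """Toy stand-in for the backend index: keeps frames per instance."""

    def __init__(self):
        self._frames = defaultdict(list)

    def ingest(self, frame):
        # Crawlers push frames; keep each instance's history time-ordered.
        history = self._frames[frame.instance_id]
        history.append(frame)
        history.sort(key=lambda f: f.timestamp)

    def latest(self, instance_id):
        history = self._frames.get(instance_id)
        return history[-1] if history else None

    def query(self, predicate):
        # Analytics query the index instead of touching the instances.
        return sorted(
            iid for iid, history in self._frames.items()
            if history and predicate(history[-1])
        )
```

An analytics application would then call something like `index.query(lambda f: "nginx" in f.features.get("packages", {}))` without ever logging into an instance.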

  8. Crawler: How it Works for VMs
     [Figure labels: VM (OS, APP, MEM, Disk); Hypervisor; Memory Crawl API; MEM View; Disk Crawl API; Disk View; Crawl Logic; Structured view of VM states; Frames; KB; Cloud Analytics; Analytics Apps]
     • Leverage VM Introspection (VMI) techniques to access VM Mem and Disk state (We built a bunch of our own optimizations that make this very efficient and practical)
     • Can even remote both (decouple all from VM and host)
     • Almost no new dependencies on host
     • Currently support 1000+ kernel distros

  9. Crawler: How it Works for Containers
     • Leverage Docker APIs for base container information
     • Exploit container abstractions (namespace mapping and cgroups) for deeper insight
     • Provide deep state info at scale with no visible overheads to end user
     1) Get visibility into container world by namespace mapping
     2) Crawl the container (Crawler dependencies still borrowed from host. No need to inject into container!)
     3) Return to original namespace
     4) Push data to backend index
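
The four steps on slide 9 can be sketched as follows. This is an assumption-laden illustration, not the crawler's actual code: it relies on Linux, root privileges, a single-threaded process, and `os.setns`, which is only available in Python 3.12 and later. The names `ns_path`, `in_namespace`, `crawl_container`, `collect`, and `push` are hypothetical, and only one namespace type is joined to keep the sketch short.

```python
import os
from contextlib import contextmanager


def ns_path(pid, ns):
    """Path of the namespace handle the kernel exposes for a process."""
    return f"/proc/{pid}/ns/{ns}"


@contextmanager
def in_namespace(container_pid, ns="net"):
    """Temporarily join one namespace of a container's init process."""
    if not hasattr(os, "setns"):
        raise RuntimeError("os.setns needs Python 3.12+ on Linux")
    original = os.open(ns_path("self", ns), os.O_RDONLY)
    target = os.open(ns_path(container_pid, ns), os.O_RDONLY)
    try:
        os.setns(target)            # step 1: map into the container's namespace
        try:
            yield                   # step 2: crawl, using the host's own binaries
        finally:
            os.setns(original)      # step 3: return to the original namespace
    finally:
        os.close(target)
        os.close(original)


def crawl_container(container_pid, collect, push):
    """Run collect() inside the container's namespace, then ship the result."""
    with in_namespace(container_pid):
        data = collect()
    push(data)                      # step 4: push data to the backend index
```

Because the crawler process and its dependencies live on the host, nothing has to be installed or injected into the container image.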

  10. Where We Are Today: Running in Prod in IBM Bluemix Containers
     • Monitoring & Logging for all containers in IBM Bluemix Container Service
       “Users do not have to do anything to get this info. It is readily available by default”
     [Figure labels: Docker Hosts running containers (Cont.) with Apps; Provisioning; Tenancy Info; Container Cloud; o-o-b state crawler; o-o-b log crawler; Metrics & Logs Bus; Multitenant Index; UI Engine]

  11. Kicking the Tires
     Just start a Bluemix Container (https://console.ng.bluemix.net/)
     Go to Container Overview (Metrics show up in a few mins)

  12. …Kicking the Tires
     Go to Monitoring and Logs >> Monitoring

  13. …Kicking the Tires
     Go to Monitoring and Logs >> Logging

  14. Example App: Topology Discovery
     Live discovered topology
     Across: instances, applications, process hierarchy
     Track changes wrt previous state
     Verify desired comm. patterns
     Detect unexpected connections
     [Figure legend: Time axis; Grey = closed connection; Green = new connection]
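
Tracking changes against the previous state and detecting unexpected connections, as listed on slide 14, reduces to set comparisons over crawled connection data. The sketch below assumes each connection is a `(source, destination, port)` tuple; that representation and the function names `diff_topology` and `unexpected` are hypothetical.

```python
def diff_topology(previous, current):
    """Compare two crawls of the connection set.

    'new' corresponds to the green edges in the slide's legend,
    'closed' to the grey ones.
    """
    return {
        "new": current - previous,
        "closed": previous - current,
    }


def unexpected(current, allowed):
    """Connections that are live but not part of the desired pattern."""
    return {conn for conn in current if conn not in allowed}
```

For example, if a web tier is only meant to talk to its database, any other live tuple returned by `unexpected` is a candidate for investigation.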

  15. Example App: Vulnerability Scan
     Space/Time matrix shows what parts of the App are vulnerable, and when the problem was detected
     Link to vulnerability
     Activity score indicates how critical this vulnerability is for overall App health
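
A space/time matrix like the one described on slide 15 could be derived from crawled package lists roughly as follows. This is a sketch under stated assumptions: the advisory format (package name mapped to a set of vulnerable versions), the crawl layout, and the names `scan` and `space_time_matrix` are all hypothetical, and the activity score from the slide is not modeled.

```python
def scan(packages, advisories):
    """Return (name, version) pairs that match a known-vulnerable version.

    packages:   {package_name: installed_version} from one crawl
    advisories: {package_name: set of vulnerable versions} (assumed format)
    """
    return [
        (name, version)
        for name, version in sorted(packages.items())
        if version in advisories.get(name, set())
    ]


def space_time_matrix(crawls, advisories):
    """Scan every crawl of every instance.

    crawls: {instance_id: {timestamp: packages}}
    Result: {instance_id: {timestamp: findings}}, i.e. which parts of the
    app are vulnerable (space) and when that was detected (time).
    """
    return {
        instance: {
            ts: scan(packages, advisories)
            for ts, packages in sorted(by_time.items())
        }
        for instance, by_time in crawls.items()
    }
```

Because the scan runs against indexed crawl data, it needs no access to the running instance.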

  16. Coming Soon to Bluemix Containers: Vulnerability Advisor

  17. Parting Remarks
     In the paper:
     • Unified data pipeline
     • VM introspection
     • Container introspection
     • Perf & scalability evaluation
     • Applications
     In other work:
     • Near-Field Monitoring [Sigmetrics’14] (UofT)
     • VMI Perf & Consistency [VEE’15] (UofT)
     • Distributed Streaming VMI [IC2E’14] (CMU)
     • SW Discovery by Example [BigData’14] (BU)
     • System as Data [Tutorial ASPLOS’14]
     Please try it for yourself:
     • Bluemix Platform: https://console.ng.bluemix.net/
     Follow us and see our articles & blogs:
     • @IBMBluemix | @IBMContainers | @IBMResearch | @canturkisci

  18. Thanks!
     See Our Demos: http://is.gd/OrigamiDemos
     • What if we can provide out-of-the-box monitoring services on any instance… merely by the act of deploying it in our cloud?
     • What if we can run compliance checks for online and offline instances… without any perturbation to the guests?
     • What if we can deliver always-on security/log/config monitoring on an instance exactly when it matters… even if it is disconnected, hung or compromised?

  20. Why the Traditional In-band Solutions Begin to Suck
     Scenario 1: Ephemeral Instances
     • Instances of App A fail shortly after provisioning. Reprovisioning automation results in the same systemic failure.
     • How to root cause the issue when systems keep dying before we can access them?
     • How to avoid cascading failures?
     Scenario 2: Unresponsive Systems
     • My app stopped responding. Access to the instance fails, and all my in-app monitors went completely silent.
     • In-band monitoring solutions fail at the exact moment we need them the most.
     • Can we provide a better, always-on solution for health, monitoring, compliance, etc.?
     Scenario 3: Agent Updates across Entire Inventory
     • Transitioning from shiny tool S to shinier tool E for operational monitoring. Need to reprovision each of our 1000 instances with the new runtime component.
     • DevOps and CD surely help; but still, how fun :)
     • Is the risk worth the effort? How often can we do these (we have baggage)?

  21. Crawler: What We Actually Collect (and Annotate)
     From Container/VM:
     - OS Info
     - Processes
     - Disk Info
     - Metrics
     - Connection Info
     - Packages
     - Files
     - Config Info
     Annotators:
     - Config Annotator
     - Vulnerability Annotator
     - Compliance Annotator
     Docker Specific:
     - Docker metadata (docker inspect)
     - CPU metrics (/cgroup/cpuacct/)
     - Memory metrics (/cgroup/memory)
     - Docker history
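
The Docker-specific CPU and memory metrics on slide 21 come from cgroup files read on the host. A minimal sketch, assuming cgroup v1 file names (`cpuacct.usage` in nanoseconds, `memory.usage_in_bytes`, `memory.stat`): the per-container directory passed as `cgroup_dir` is an assumption because its location varies by host setup, and all function names are hypothetical.

```python
from pathlib import Path


def parse_flat_keyed(text):
    """Parse 'key value' lines, the layout used by files such as memory.stat."""
    stats = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value.strip().isdigit():
            stats[key] = int(value)
    return stats


def cpu_percent(usage_ns_start, usage_ns_end, interval_s):
    """CPU utilisation over an interval from two cpuacct.usage samples.

    cpuacct.usage is cumulative CPU time in nanoseconds, so the rate is the
    delta divided by the wall-clock interval (100.0 means one full core).
    """
    return (usage_ns_end - usage_ns_start) / (interval_s * 1e9) * 100.0


def read_memory_metrics(cgroup_dir):
    """Read memory metrics for one container from its memory cgroup directory.

    cgroup_dir is assumed to be the container's directory under the memory
    controller mount; the caller must supply the right path for the host.
    """
    base = Path(cgroup_dir)
    return {
        "usage_bytes": int((base / "memory.usage_in_bytes").read_text()),
        "stat": parse_flat_keyed((base / "memory.stat").read_text()),
    }
```

Since these files are read from the host side, the container sees no agent and no additional processes.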

  22. All Crawl Features from Container Service (291)

  23. AMS vs. Current Affairs [Use Case Map]
     Problem Diagnosis:
     • NOW: Very intrusive (core dumps, extract state, run diagnostic code,…)
     • All fail when the system is hosed
     • AMS: Always on [Live OpenStack Oops Demo]
     Security:
     • NOW: Commonly in-band solutions (Still, most introspection-friendly field)
     • Information from guest OS can be tainted easily
     • AMS: Leverages raw, out-of-band info [RConsole Rootkit Detection]
     Health/Performance Monitoring:
     • NOW: Rely on in-system agents, e.g., ITM, QRadar ALE
     • The information is only as good and as available as the system/agent
     • AMS: Touchless, still available when all else fails [QRadar SIEM]
     Compliance:
     • NOW: Collect data, run queries on each endpoint, e.g., TEM agents
     • Scale and maintenance issues; time-correlated events (e.g., compliance storms)
     • AMS: Decouple system-observations, query index instead [E2E RC2 Demo]

  24. Live Oops RConsole on OpenStack
     Flow:
     • Pull vanilla image from Ubuntu cloud images
     • Add to Glance repo
     • Provision + Start VM from img
     • Start monitoring it even during boot, starting services
     • No need to ever log into VM for data collection:
       • Crash the system
       • You are helpless unless agentless
       • [Still] can track system state with RConsole
     [Figure labels: cloud-images.ubuntu.com; SoftLayer; OpenStack; Compute Node; Glance; RConsole; VM; Crawl; > ps; flow steps 1, 2, 3, 4]

  25. Demo: Security Event Monitoring with QRadar in RC2
     OVERVIEW:
     - 2 QRadar instances side by side
     - 1st shows the standard in-VM agent (ALE) based monitoring
     - 2nd shows out-of-VM introspection-based monitoring
     - 2nd works even when the VM loses network, hangs, or agent is compromised!
     KEY POINTS:
     1. TOUCHLESS: Nothing runs on or peeks inside the target VM!
     2. THIN RED LINE: The only requisite is a read-only target to the VM disk image!
     [Figure labels: RC2; QRadar; Hyper ON; VM [Win2003]; OS; Agent (A); In-VM path; Out-of-VM path; Origami; Wherever; Converter for QRadar Data Format; Crawler VM; Dispatcher VM; REST; Out-of-VM Event log Introspection; CloudMgr VM]

  26. Origami Demos

  27. …Origami Demos

  28. …Origami Demos

  29. …Origami Demos

  30. Contrasting Views of System Lifetime
     [Figure panels: Traditional; Crawling Systems as Documents]

  31. Monitor Me! Agentless Monitoring with Crawlers
     [Figure labels: Host, BMS, VM, Container (Cont.), OS, Workload (Wkld), Agent (A)]
     Embed data collection into platform, not in customer systems!
     • Seamless, always-on monitoring
     • Consumable: No agents, no guest credentials/access
     • Non-intrusive: works out of the box
     • Tamper proof: Can inspect unresponsive, defunct system
     • Guest Agnostic: Same crawler for all systems
     • Decoupled from guest context
     Docker Opportunity
     • Monitoring done right for the processes of the Cloud OS
     • The only feasible way to monitor containers; solves limitations of traditional approaches [Ephemeral systems, Scale, Bootstrapping, Bring your own image, Collector footprints vs. lightweight containerization]
