Spark Integration Into an Enterprise Stack

Presentation Transcript


  1. Spark Integration Into an Enterprise Stack: Open Source Successes & Challenges

  2. About the presenter: Konstantin (Cos) Boudnik
     - Zen-Empiricist
     - Director of WANdisco Big Data Engineering
     - In charge of delivering the company’s enterprise-grade NonStop Hadoop solution
     - ASF Hadoop and MRUnit committer
     - Co-author of ASF Bigtop
     - Spark/Shark contributor
     - Apply with caution: highly abrasive (according to most of his managers, now former)
     Slide footer: Shark Integration: Challenges and Lessons Learnt

  3. Open source is a force of natural evolution
     Most apparent characteristics:
     - Fail fast on your own dime
     - Hard or impossible to control by authority (!)
     - Resistant to political correctness bias (aka political bulls#$t)
     - Creates a huge competitive advantage
     Resulting in:
     - Highly successful projects
     - Innovation up to the limit
     - Technologically disruptive
     - Rules the world (once matured)
     Empirical evidence:
     - Everything on the planet is “Powered by Linux”
     - “Bad” news: Android market share will never double again
     - Firefox is THE web browser of the world
     - I ran out of slide space and my time slot is limited...
     Anarchy: ἀν + ἀρχός (an + arkhos), “without a ruler”

  4. What “open source” oftentimes is
     - Open => anyone can do what they’re most interested in doing
     - Innovative => creates formats & standards as it goes; abandons them in passing
     - Stable => we’ll fix it in the next release
     - Backward compatible => we might break it, but we’ll fix it
     - Fault tolerant and, at least, highly available => if you configure the hell out of it
     - Configuration management => shell scripts or Python to generate configuration
     - Deployment management (packages and Puppet) => here’s your tarball
     - Supported (there’s a throat to choke) => “Gone fishing!”
     - Secure => a million eyeballs will find all your bugs in no time
     I am not bashing open source: it is my bread & butter

  5. Let’s call a spade a spade: what “enterprise grade” really is
     - Compatible with standards, scalable
     - Stable: feature set, release schedules, bug fixes, upgrades
     - Backward compatible with itself
     - Fault tolerant and, at least, highly available
     - Configuration management (you know your environment)
     - Deployment management (packages and Puppet)
     - Supported (there’s a throat to choke)
     - Secure
     - … and more

  6. The goals are aligned. How about semantics?
     The devil is in the details

  7. Case study: major telecom SI. What we have built
     - OpenJDK 7: Guess what? Not everybody is in love with Larry Ellison
     - Hive 0.11’ish: It is 3 light years ahead of Hive 0.9 and 5 light years behind enterprise grade
     - Spark 0.8: Hello Apache Incubation!
     - Shark 0.8’ish

  8. What does the stack look like?
     What it implies for development and customers alike

  9. Fixes that span multiple components
     - Memory leaks: JobConf held by a ThreadLocal

  10. What does it mean? それが何を意味している (Japanese: “What does that mean?”)
     Semantic and toolset barriers between JVM languages

  11. Unsynchronized release trains
     Upstream components often live their own lives

  12. Impatient customers
     I want everything on the menu! NOW!

  13. What else can possibly go wrong?
     “Hold my beer!” (Famous last words)

  14. Lessons learnt & principles applied
     - Proper system integration
     - Git & a well-thought-out branching model
     - ASF Bigtop as the integration point
     - Close collaboration with the open source community
     - All fixes and features are offered to the appropriate projects; most are accepted
     - Tireless and careful back-porting
     - Continuous Integration and Delivery
     - Simplifying development where possible
     - Switch from “org.apache.hive” to “edu.berkeley.cs.shark”
     - Keep your version control system open
     - Education and expectations management
     - “Released” in open source does not always mean “usable in the datacenter”
     - “What to do, what to do?” (r. Bender)

  15. Thank you
     Konstantin.Boudnik@wandisco.com, @c0sin
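
Slide 9 attributes a memory leak to a JobConf held by a ThreadLocal. The slides do not show the code involved, so what follows is only a minimal Java sketch of the general mechanism, not the actual Hive/Shark fix. It assumes a pooled worker thread that outlives individual jobs; `FakeJobConf`, `CURRENT_CONF` and `countStaleConfs` are hypothetical names invented for this illustration, and `FakeJobConf` merely stands in for Hadoop's `JobConf`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalLeakDemo {

    /** Hypothetical stand-in for a heavyweight per-job configuration object. */
    static final class FakeJobConf {
        final String jobId;
        FakeJobConf(String jobId) { this.jobId = jobId; }
    }

    // Each thread gets its own slot. The value stays strongly reachable
    // for as long as the thread lives, unless remove() is called.
    private static final ThreadLocal<FakeJobConf> CURRENT_CONF = new ThreadLocal<>();

    /**
     * Runs five jobs on a single pooled thread and counts how many of them
     * find a configuration left behind by an earlier job.
     */
    public static int countStaleConfs(boolean cleanUp) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        AtomicInteger stale = new AtomicInteger();
        try {
            for (int i = 0; i < 5; i++) {
                final String jobId = "job-" + i;
                pool.submit(() -> {
                    if (CURRENT_CONF.get() != null) {
                        stale.incrementAndGet(); // leftover from a previous job
                    }
                    CURRENT_CONF.set(new FakeJobConf(jobId));
                    try {
                        // ... the job would use CURRENT_CONF.get() here ...
                    } finally {
                        if (cleanUp) {
                            CURRENT_CONF.remove(); // release the reference
                        }
                    }
                });
            }
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return stale.get();
    }

    public static void main(String[] args) {
        System.out.println("stale without cleanup: " + countStaleConfs(false));
        System.out.println("stale with cleanup:    " + countStaleConfs(true));
    }
}
```

Without the `remove()` call, every job after the first finds the previous job's object still attached to the worker thread, which is how a long-lived server thread can pin large configuration objects in memory. Clearing the slot in a `finally` block is one common remedy; whether the fix referred to on the slide took this form is not stated.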
