Scala and Spark Training

https://www.learntek.org/scala-spark-training/

Learntek is a global online training provider for Big Data Analytics, Hadoop, Machine Learning, Deep Learning, IoT, AI, Cloud Technology, DevOps, Digital Marketing, and other IT and management courses.

Presentation Transcript


  1. Scala and Spark Training

  2. What is Scala? Scala is a modern multi-paradigm programming language designed to express common programming patterns in a concise, elegant, and type-safe way. The name comes from "Scalable Language". Scala is a hybrid language that smoothly integrates object-oriented and functional programming features, and it compiles to run on the Java Virtual Machine. Scala was created by Martin Odersky; it was first released internally in 2003 and publicly in early 2004.
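
The blend of object-oriented and functional styles described on this slide can be sketched in a few lines. This is a minimal illustration, not course material: the `Employee` class, the sample data, and the `totalSalary` helper are hypothetical names invented for the example.

```scala
// Object-oriented side: a small immutable class with named, typed fields.
final case class Employee(name: String, department: String, salary: Double)

object PayrollDemo {
  // Hypothetical sample data, invented for illustration only.
  val staff: List[Employee] = List(
    Employee("Asha", "Data", 5200.0),
    Employee("Ben", "Data", 4800.0),
    Employee("Chen", "Web", 4500.0)
  )

  // Functional side: a pure function composed from filter, map and sum.
  // The compiler checks every step, so a misspelled field or a wrong type
  // is rejected at compile time (the "type-safe" part of the slide).
  def totalSalary(employees: List[Employee], department: String): Double =
    employees.filter(_.department == department).map(_.salary).sum

  def main(args: Array[String]): Unit =
    println(totalSalary(staff, "Data"))
}
```

The case class gives equality, immutability, and a constructor without extra code, while the one-line `totalSalary` shows the concise functional style the slide refers to.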

  3. Why Scala? Several reasons encourage learning Scala. Many companies that depend on Java for business-critical applications are turning to Scala to boost development productivity, application scalability, and overall reliability. Scala is a type-safe JVM language that combines object-oriented and functional programming features in a concise, logical, and powerful language. Copyright © 2019 Learntek. All Rights Reserved.

  4. Scala offers a "better Java" alternative by keeping its syntax close to Java's, which minimizes the learning curve. It was created with the goal of being a better language that avoids the features of Java considered restrictive, overly tedious, or frustrating. The result is a cleaner, well-organized language that is easier to use and increases productivity.

  5. What is Spark? Spark is a cluster computing technology designed for fast computation. It builds on the Hadoop MapReduce model and extends it to efficiently support more types of computation, such as interactive queries and stream processing. Spark can use Hadoop in two ways: for storage and for processing. Because Spark has its own cluster management, it typically uses Hadoop for storage only.

  6. Spark was developed in 2009 at UC Berkeley's AMPLab by Matei Zaharia. It was open-sourced in 2010 under a BSD license, donated to the Apache Software Foundation in 2013, and became a top-level Apache project in February 2014.

  7. Why Spark? Spark was introduced to speed up Hadoop-style computation. Its main feature is in-memory cluster computing, which greatly increases application processing speed. Spark is designed to cover a wide range of workloads, such as batch applications, iterative algorithms, interactive queries, and streaming applications, which reduces the management burden of maintaining separate tools.

  8. Apache Spark also has the following features.
     • Speed: Spark can run an application in a Hadoop cluster up to 100 times faster in memory, and up to 10 times faster on disk, than Hadoop MapReduce. It does this by reducing the number of read/write operations to disk and by storing intermediate processing data in memory.

  9. • Supports multiple languages: Spark provides more than 80 high-level operators for interactive querying and built-in APIs for application development in Java, Scala, and Python.
     • Advanced analytics: Spark supports not only "map" and "reduce" operations but also SQL queries, streaming data, machine learning (ML), and graph algorithms.
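
To make the "map" and "reduce" style mentioned above concrete, here is a minimal word-count sketch. It deliberately uses plain Scala collections rather than the Spark API, so it runs without a cluster; the object and method names are hypothetical. Spark's own operators follow a similar shape but run distributed across a cluster.

```scala
object WordCountDemo {
  // Map phase: split each line into lowercase words and pair each word with 1.
  // Reduce phase: group the pairs by word and sum the counts per group.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\s+")) // one line becomes many words
      .filter(_.nonEmpty)                   // drop empty tokens from blank input
      .map(word => (word, 1))               // emit (key, value) pairs
      .groupBy(_._1)                        // bring equal keys together
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit =
    wordCount(Seq("Spark is fast", "spark is general")).foreach(println)
}
```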

  10. The following topics will be covered in our Scala and Spark Training:
      Introduction to Scala: Overview of Scala; Installing Scala; Scala Basics; IDE for Scala; Scala Worksheet.

  11. Scala Programming: Variables & Methods; Literals; Reserved Words; Operators; Precedence Rules; Operator Associativity; Ways of Executing a Scala Program.
      Expressions and Loops: If Expression; For Expression; Usage of the "yield" keyword in For Expression; Exception handling with Try Expression; Match Expression; While Loops; Do-While Loops.

  12. Functions in Scala: Methods; Nested Methods; First-class Functions; Higher-Order Methods; Function Literals; Partially Applied Functions; Tail Recursion; Closures; Currying; Control Abstraction; Call-by-name vs. call-by-value; Repeated Parameter passing mechanism; Named Parameter mechanism; Default Parameter mechanism.
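
Three of the topics listed on this slide (tail recursion, currying, and closures) fit in one short sketch. The function names below are hypothetical and chosen only for illustration; the course may present these ideas differently.

```scala
import scala.annotation.tailrec

object FunctionsDemo {
  // Tail recursion: the recursive call is the last action in `loop`, so the
  // compiler can turn it into a loop. @tailrec makes compilation fail if not.
  def factorial(n: Int): BigInt = {
    @tailrec
    def loop(remaining: Int, acc: BigInt): BigInt =
      if (remaining <= 1) acc else loop(remaining - 1, acc * remaining)
    loop(n, BigInt(1))
  }

  // Currying: two parameter lists. Supplying only the first one yields a
  // new function that still expects the second argument.
  def scale(factor: Double)(value: Double): Double = factor * value
  val twice: Double => Double = scale(2.0)

  // Closure: the returned function captures `threshold` from its environment.
  def above(threshold: Int): Int => Boolean = x => x > threshold

  def main(args: Array[String]): Unit = {
    println(factorial(5))
    println(twice(21.0))
    println(List(1, 5, 10).filter(above(4)))
  }
}
```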

  13. OOP in Scala: Classes & Objects; Defining a Constructor; Constructor Parameter vs. Class Parameter; Singleton Object; Companion Object; Abstract Class; Uniform Access Principle; Access Modifiers; Extending a Class; Namespace in Scala; Calling a Superclass Constructor; Dynamic Binding in Scala; Final Member in Scala Class; Scala Class Hierarchy; Object Equality in Scala; Factory Design Pattern in Scala.

  14. Traits: Introduction to Traits; Inheritance in Traits; Mixing in a Trait; Trait vs. Class; Ordered Trait; Example of Ordered Trait; Stackable Modification behaviour of Traits; Example of Stackable Modification; Rules for mixing in multiple traits.
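
The "stackable modification" topic on this slide is easiest to see with a small queue example. The sketch below follows a commonly used textbook pattern; the class and trait names are illustrative assumptions, not taken from the course.

```scala
import scala.collection.mutable.ArrayBuffer

object StackableTraitsDemo {
  abstract class IntQueue {
    def put(x: Int): Unit
    def get(): Int
  }

  class BasicIntQueue extends IntQueue {
    private val buffer = ArrayBuffer.empty[Int]
    def put(x: Int): Unit = { buffer += x; () }
    def get(): Int = buffer.remove(0)
  }

  // `abstract override` lets a trait modify a method whose concrete
  // implementation is supplied later, when the trait is mixed in.
  trait Doubling extends IntQueue {
    abstract override def put(x: Int): Unit = super.put(2 * x)
  }

  trait Incrementing extends IntQueue {
    abstract override def put(x: Int): Unit = super.put(x + 1)
  }

  private def roundTrip(queue: IntQueue, x: Int): Int = { queue.put(x); queue.get() }

  // The rightmost trait runs first, so mixing order changes the result.
  def doubleThenIncrement(x: Int): Int =
    roundTrip(new BasicIntQueue with Incrementing with Doubling, x) // 2 * x + 1

  def incrementThenDouble(x: Int): Int =
    roundTrip(new BasicIntQueue with Doubling with Incrementing, x) // 2 * (x + 1)

  def main(args: Array[String]): Unit = {
    println(doubleThenIncrement(10))
    println(incrementThenDouble(10))
  }
}
```

The two helper methods differ only in the order of the mixed-in traits, which is the point of the "rules for mixing in multiple traits" topic.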

  15. Packaging: Package; Different forms of Scala Package; Import statement; Different forms of Import; Package Object; Implicit Imports.

  16. Case Class & Pattern Matching: Introduction to Case Class; Introduction to Pattern Matching; Example of Pattern Matching; Wildcard Pattern; Constant Pattern; Variable Pattern; Constructor Pattern; Sequence Pattern; Tuple Pattern; Type Pattern; Variable Binding; Pattern Guard; Sealed Class; Option Data Type; Usage of Option Data Type; Pattern Usage; Partial Function; Case Class and Partial Function; Usage of Pattern in For Expression.
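
Several of the patterns named on this slide can be shown together. The sketch below is a minimal illustration under assumed names (`Shape`, `describe`, `firstEven`, and so on are hypothetical); it covers constructor patterns, pattern guards, a sealed hierarchy, `Option`, and sequence patterns on `List`.

```scala
object PatternMatchingDemo {
  // Sealed hierarchy: all subtypes live in this file, so the compiler can
  // warn when a match does not cover every case.
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Rectangle(width: Double, height: Double) extends Shape

  def describe(shape: Shape): String = shape match {
    case Circle(r) if r == 0.0     => "point"     // constructor pattern with a guard
    case Circle(_)                 => "circle"    // wildcard pattern for the field
    case Rectangle(w, h) if w == h => "square"
    case Rectangle(_, _)           => "rectangle"
  }

  // Option models a value that may be absent, instead of using null.
  def firstEven(xs: List[Int]): Option[Int] = xs.find(_ % 2 == 0)

  def render(result: Option[Int]): String = result match {
    case Some(n) => s"found $n"
    case None    => "nothing found"
  }

  // Sequence patterns on List.
  def summarize(xs: List[Int]): String = xs match {
    case Nil         => "empty"
    case head :: Nil => s"one element: $head"
    case head :: _   => s"starts with $head"
  }

  def main(args: Array[String]): Unit = {
    println(describe(Rectangle(2.0, 2.0)))
    println(render(firstEven(List(1, 3, 4))))
    println(summarize(List(7, 8)))
  }
}
```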

  17. Scala Collection: Immutable and Mutable collections; Constructing objects of Array, Set, List, Tuple, Map; Detailed discussion of various methods in the List class and List object; List Construction; Basic operations like head, tail, isEmpty on List; List Pattern; Example of using List Pattern; Categories of methods in List; First-Order Methods in List; Higher-Order Methods in List; map vs. flatMap; Filtering a List; Example of takeWhile, dropWhile, span, partition; Predicates over List; Folding over List; foldLeft vs. foldRight.
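
A few of the List comparisons named on this slide (map vs. flatMap, span vs. partition, foldLeft vs. foldRight) are sketched below on one small list. The values in the comments were worked out by hand for this specific input; the object name is hypothetical.

```scala
object ListOpsDemo {
  val numbers: List[Int] = List(1, 2, 3, 4, 1)

  // map keeps one result per element; flatMap flattens the nested results.
  val mapped: List[List[Int]] = numbers.take(2).map(n => List(n, n * 10))  // List(List(1, 10), List(2, 20))
  val flatMapped: List[Int] = numbers.take(2).flatMap(n => List(n, n * 10)) // List(1, 10, 2, 20)

  // span stops at the first element that fails the predicate;
  // partition tests every element.
  val (prefix, rest) = numbers.span(_ < 3)      // (List(1, 2), List(3, 4, 1))
  val (small, large) = numbers.partition(_ < 3) // (List(1, 2, 1), List(3, 4))

  // Subtraction is not associative, so the fold direction is visible.
  val leftFold: Int = numbers.foldLeft(0)(_ - _)   // ((((0 - 1) - 2) - 3) - 4) - 1 = -11
  val rightFold: Int = numbers.foldRight(0)(_ - _) // 1 - (2 - (3 - (4 - (1 - 0)))) = -1

  def main(args: Array[String]): Unit = {
    println(mapped)
    println(flatMapped)
    println((prefix, rest))
    println((small, large))
    println((leftFold, rightFold))
  }
}
```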

  18. Introduction to Spark: Introduction to Big Data; Big Data Problem; Scale-Up vs. Scale-Out Architecture; Characteristics of Scale-Out; Introduction to Hadoop, MapReduce and HDFS; Introducing Spark.

  19. Hortonworks Data Platform (HDP) using VirtualBox: Importing the HDP VM image using VirtualBox on a local machine; Configuring HDP; Overview of Ambari and its components; Overview of services configuration using Ambari; Overview of Apache Zeppelin; Creating, importing and executing notebooks in Apache Zeppelin.

  20. IDEs for Spark Applications: SBT and its overview; IntelliJ; Eclipse; Resolving dependencies for Spark applications.

  21. Spark Basics: Spark Shell; Overview of Spark architecture; Storage layers for Spark; Initializing a SparkContext and building applications; Submitting a Spark Application; Use of Spark History Server; Spark Components; Spark Driver Process; Spark Executor; SparkConf and SparkContext; SparkSession object; Overview of spark-submit command; Spark UI.

  22. RDDs: Overview of RDD; RDD and Partitions; Ways of Creating RDD; RDD Transformations and Actions; Lazy evaluation; RDD Lineage Graph (DAG); Element-wise transformations; map vs. flatMap Transformation; Set Transformation; RDD Actions; Overview of RDD persistence; Methods for persisting RDD; Persisting RDD with Storage option; Illustration of Caching on an RDD in DAG; Removal of Cached RDD.

  23. Pair RDDs: Overview of Key-Value Pair RDD; Ways of creating Pair RDDs; Transformations on Pair RDD; reduceByKey(), foldByKey(), mapValues(), flatMapValues(), keys() and values() Transformations; Grouping, Joining, Sorting on Pair RDD; reduceByKey() vs. groupByKey(); Pair RDD Actions.

  24. Launching Spark on a cluster: Configure and launch Spark Cluster on Google Cloud; Configure and launch Spark Cluster on Microsoft Azure.
      Logging and Debugging a Spark Application: Setting up a Windows environment for executing a Spark Application using an IDE; Steps for using the slf4j logging mechanism in a Spark Application; Attaching a debugger to a Spark Application; Example of debugging a Spark application running inside a cluster.

  25. Spark Application Architecture: Spark Application Distributed Architecture; Spark Application submission Mode; Overview of Cluster Manager; Example of using Standalone Cluster Manager; Driver and its responsibilities; Overview of Job, Stage and Tasks; Spark Job Hierarchy; Executor; spark-submit command and various submission options; YARN Cluster Manager; YARN Architecture; Client and Cluster Deploy-mode.

  26. Advanced concepts in Spark: Accumulator; Broadcast; RDD partitioning; Re-partitioning an RDD; Determining RDD partitioner.

  27. Spark SQL: Introduction to Spark SQL; Creating SparkSession with Hive Support; DataFrame; Ways of Creating DataFrame; Registering a DataFrame as View; DataFrame Transformations API; DataFrame SQL statement; Aggregate Operations; DataFrame Action; Catalyst Optimizer; Catalog API.

  28. Limitation of DataFrame; Introduction to Dataset; Introduction to Encoder; Creating Dataset; Functional transformation on Dataset; Loading CSV, JSON, Parquet format files in Spark SQL; Loading and saving data from/in Hive, JDBC, HDFS, Cassandra; Introduction to User-Defined Function (UDF); Customizing a UDF; Usage of UDF in DataFrame Transformations API; Usage of UDF in Spark SQL statement; Introduction to Window Function; Steps of defining a window function; Illustration of Window function usage; Introduction to UDAF; Customizing a UDAF; Illustration of customized UDAF usage.

  29. Spark Streaming: Introduction to data streaming; Spark Streaming framework; Spark Streaming and Micro-batch; Introduction to DStreams; DStreams and RDD; Word Count example using Socket Text Stream; Streaming with Twitter feeds; Setting up a Twitter App; Resolving Twitter dependency in Spark Streaming Application; Steps of creating Uber Jar; Example of extracting hashtags from tweet data; Troubleshooting Twitter Streaming issue in Spark Application; Steps of creating Spark Streaming Application; Architecture of Spark Streaming; Stateless Transformations.

  30. Twitter Streaming examples using stateless transformation; Introduction to stateful Transformations; Window Transformations; Window Duration and Slide Duration; Window Operations; Naive and inverse window reduce operation; Checkpoint; Tracking State of an event using updateStateByKey operation; Interacting directly with RDD using transform() operation; Example of HDFS file streaming; Example of Spark-Kafka interaction; Saving DStreams to external file system.

  31. For more training information, contact us. Email: info@learntek.org; USA: +1734 418 2465; INDIA: +40 4018 1306, +7799713624.