EAS-030 Spark Scala Kubernetes (Piloting)

Location
Online
Language
English
Code
EAS-030
Schedule and prices
600.00 *
Training for 7-8 or more people? Customize trainings for your specific needs

Description

The course introduces Apache Spark with Scala, starting with a foundational understanding of Spark's architecture and its advantages over Hadoop MapReduce.
After completing the course, participants receive a certificate issued on the Luxoft Training form.

Objectives

  1. Foundational Spark Principles: Covers Spark's core concepts and architecture, compares its efficiency with Hadoop MapReduce, and surveys its resource managers.
  2. Spark & Kubernetes Synergy: Covers containerizing Spark applications, Kubernetes concepts, and efficient deployment techniques.
  3. Data API Proficiency: Covers Spark's high-level Data APIs, DataFrame and Dataset, including their differences, parallelization, and storage methods.
  4. External Data Management Mastery: Covers techniques for reading from and writing to external storage, choosing data formats, and transferring data efficiently.
  5. Spark Optimization & Streamlining: Addresses common performance problems in Spark, optimization strategies, and Structured Streaming techniques and applications.

Target Audience

Developers, architects

Prerequisites

Basic programming skills in Java and Scala. Familiarity with the Unix/Linux shell. Experience with databases. Kafka experience is optional.

Roadmap

  • Spark concepts and architecture (theory 2h 30m, practice 1h 30m)

    Compare Spark with Hadoop MapReduce through hands-on examples. Study the Lambda architecture and the difference between batch and streaming processing. Review Spark's resource managers: Kubernetes, YARN, and Standalone. Learn how to launch Spark applications. Definitions of the key terms are included.


  • Containerizing and deploying Spark applications to Kubernetes (theory 1h, practice 1h)

    Learn containerization and Kubernetes terminology. Compare Kubernetes with YARN. Understand dynamic resource allocation. Learn to containerize Spark applications, deploy them to Kubernetes, and launch them.

  • High-level Data API: DataFrame, Dataset

    Explore the high-level Data APIs: DataFrame and Dataset. Understand the differences between RDD, DataFrame, and Dataset. Learn how to create and parallelize them. Analyze DataFrame and Dataset execution using query plans and DAGs. Learn how to save data to HDFS, FTP, and S3.

  • Loading data from and into external storage

    Read from and write to HDFS, S3, FTP, and the file system (FS). Choose suitable data formats. Learn parallelized JDBC reads and writes. Create DataFrames and Datasets from Kafka topics. Load data into Cassandra efficiently.

  • Spark optimization cases

    Work through Spark optimization scenarios: resolving 'out of memory' errors, managing small files in HDFS, correcting skewed data, speeding up joins, optimizing broadcasts of large tables, sharing resources, and using Adaptive Query Execution (AQE) and Dynamic Partition Pruning (DPP) for performance tuning.

  • Testing Spark Applications

    Four levels of quality for Spark applications

    Unit testing Spark applications

    Problems with unit testing Spark applications

    Libraries and solutions

  • Spark Structured Streaming

    Streaming DataFrames and Datasets

    DataFrames and Datasets based on a Kafka topic

    Loading data into Cassandra

    Working with state in Spark and Cassandra

    Optimization features
Your benefits
Expertise
Our trainers are industry experts involved in software development projects
Live training
Facilitated online so that you can interact with the trainer and other participants
Practice
A focus on helping you practice your new skills