Apache Spark Fundamentals

Duration
26 hours
Course type
Online
Language
English
Code
EAS-017
Training for 7-8 or more people? Customize trainings for your specific needs
Price
€ 700 *

Description

After completing the course, participants receive a certificate issued on the Luxoft Training form.

Objectives

During the training, participants will:

  1. Write a Spark pipeline using functional Python and RDDs;
  2. Write a Spark pipeline using Python, the Spark DSL, Spark SQL, and DataFrames;
  3. Draw an architecture with different data sources;
  4. Write a Spark pipeline that works with external systems (Kafka, Cassandra, Postgres) and runs in parallel mode;
  5. Resolve problems with slow joins.

After the training, participants will be able to build a simple PySpark application and execute it on the cluster in parallel mode.

Target Audience

  • Software developers
  • Software architects

Prerequisites

Basic Java, Python, Scala programming skills. Unix/Linux shell familiarity. Experience with databases is optional.

Roadmap

  • Spark concepts and architecture
  • Programming with RDDs: transformations and actions
  • Using key/value pairs
  • Loading and storing data
  • Accumulators and broadcast variables
  • Spark SQL, DataFrames, Datasets
  • Spark Streaming
  • Machine Learning using MLlib and Spark ML
  • Graph analysis using GraphX