Unique knowledge and skills will give you a real advantage in the market.
Apache Spark Programming (Spark 105): 3-Day Instructor-Led Public Class (Warsaw)
We are the first
The first Databricks-certified training in the region.
Conducted by a Databricks Certified Trainer.
Organisation details: Three-day (3-5 December 2018), onsite instructor-led course
Location: Poland, Warsaw (City Centre)
This course is designed for data engineers, analysts, architects, software engineers, IT operations staff, and technical managers interested in a thorough, hands-on overview of Apache Spark.
The course covers the core APIs for using Spark, fundamental mechanisms and basic internals of the framework, SQL and other high-level data access tools, as well as Spark’s streaming capabilities and machine learning APIs.
Each topic includes slide and lecture content along with hands-on use of Spark through an elegant web-based notebook environment. Inspired by tools like IPython/Jupyter, notebooks allow attendees to code jobs, data analysis queries, and visualizations using their own Spark cluster, accessed through a web browser. All class code is directly usable with pure open-source Spark or any commercial Spark distribution.
Objectives - after taking this class you will be able to:
- Describe Spark’s fundamental mechanics
- Use the core Spark APIs to operate on data
- Articulate and implement typical use cases for Spark
- Build data pipelines with SparkSQL and DataFrames
- Analyze Spark jobs using the UIs and logs
- Create Streaming and Machine Learning jobs
Topics:
- Spark Overview
- RDD Fundamentals
- SparkSQL and DataFrames
- Spark Job Execution
- Cluster Architectures for Spark
- Intro to Spark Streaming
- Machine Learning Basics
Cost: $2500 per person
All participants will need a laptop with an up-to-date version of Chrome or Firefox (Internet Explorer and Safari are not supported).
Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Databricks’ founders started the Spark research project at UC Berkeley that later became Apache Spark. Databricks provides a Unified Analytics Platform powered by Apache Spark for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production. The company also makes it easier for its users to focus on their data by providing a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of ownership. Databricks, venture-backed by Andreessen Horowitz, NEA and Battery Ventures, among others, has a global customer base that includes Viacom, Shell and HP. For more information, visit www.databricks.com.
Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation.