Introduction to Apache Spark
Apache Spark is a framework for fast processing of large amounts of data. It is widely used for advanced analytics, data science, and modern big data architectures, as well as for complex batch (ETL) processing and real-time processing. Spark includes several key components: Spark SQL for working with structured data, Spark Streaming for processing large volumes of data in real time, Spark MLlib for machine learning, Spark GraphX for graph processing, and SparkR for statistical data processing in the R language. Spark can run standalone, on a YARN (Hadoop) cluster, or on Mesos, so it fits into virtually any environment. Spark is also a polyglot framework: it supports several programming languages (Python, Java, Scala, R), so teams can develop in the language that best fits their organization or line of business. All the examples in this course will primarily be in Python, though other languages, e.g. Scala, will be used as well. The exercises will be done in standalone and cluster environments, depending on the assignment the participants work on.
This course is aimed at system architects, software developers, and business analysts.
Basic usage of the Spark system
The course gives participants a brief introduction to Spark and explains the basics of how it works. Through interactive examples, participants will complete an advanced analytics assignment on a target dataset, from downloading a large data set to the final visualization.
Prerequisites: basic knowledge of the Python programming language, familiarity with object-oriented programming, and advanced knowledge of SQL.
For any inquiries, do not hesitate to contact us at firstname.lastname@example.org.