SQL is the standard language for querying and manipulating data in relational databases. It is used mostly by software engineers, database administrators, business analysts, and data scientists.
The goal of this course is to introduce basic SQL concepts and to show how they are implemented in Big Data technologies.
Examples and exercises are done in the interactive Apache Zeppelin notebook, using both standard SQL syntax and Spark SQL. Zeppelin is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Spark, and more.
Apache Spark is a framework for fast processing of large amounts of data, often used for advanced analytics, data science, complex batch (ETL) processing, and real-time processing.
This course is made for the following audience:
- Data and Business Analysts who need to connect to Hadoop databases and use the Spark SQL processing engine to retrieve and process data
- Developers who need to understand how to use SQL with Big Data technologies such as HDFS, Hive, and Spark
- Data Science teams who need a better understanding of the SQL language
This course focuses on:
- Introduction to Databases and a light introduction to Spark
- Simple SQL Queries
- Retrieving Data from Multiple Tables
- Scalar Functions and Arithmetic
- Summary Functions and Grouping
Duration: 2 days
Prerequisite for SQL Big Data Edition: Basic knowledge of database concepts. This course is a mixture of basic and advanced SQL topics in a Big Data environment.