

Big Data and Advanced Analytics
Overview
This course explains the key elements of setting up a big data environment for advanced analytics.
Participants learn the basic concepts of big data methodology and technology. The course introduces fundamental terms such as predictive analytics, text processing and sentiment analysis. Participants gain detailed insight into data-at-rest technologies, such as Hadoop and NoSQL databases, for batch processing of large amounts of data. The course also covers data-in-motion technologies for processing data in real time (streaming, IoT, …).
Target audience
- Architects
- Business Analysts
- Data Scientists
- Data Engineers
- Integration development process engineers
Prerequisites
- Participants should be familiar with the concepts and architectures of data management.
Content
- Day 1: Introduction to the big data environment
  - What is advanced analytics?
  - The concepts of advanced analytics (predictive, sentiment, …)
  - How to establish a data science environment?
  - Data quality in a big data environment
  - Data visualization
- Day 2: Data-at-rest – managing all the available data
  - General architecture for big data solutions and the "Data Lake" (data-at-rest overview)
  - Introduction to the basic technologies for a big data environment (Hadoop and Spark)
  - What is the "polyglot persistence" architecture and how to use NoSQL databases?
  - How to modernize the existing architecture for the needs of advanced analytics?
  - Program storage devices (DWH appliance machine)
  - Exercise: building an application for big data processing
- Day 3: Data-in-motion – managing data in real time
  - What are real-time, streaming and sensor data?
  - Lambda architecture – how to combine batch and real-time analytics?
  - General architecture for big data solutions (data-in-motion overview)
  - Exercise: building an application for real-time processing
Duration
The course runs over three days and can also be booked as individual units:
- Day 1: Introduction to big data technologies
- Day 2: Architecture, technology and development for managing big data
- Day 3: Architecture, technology and development for managing data in real time
You can attend only one or two days, or the whole three-day course.
For any inquiries, do not hesitate to contact us at learn@croz.net