Introduction to Big Data

Training table

Big Data

4th of October

13:05-16:00 Local Time

Big Data

4th of October

10:05-13:00 United Kingdom Time

Big Data

4th of October

11:05-14:00 German Time

Big Data

4th of October

12:05-15:00 Russian Time

Information about the training

During the training, you will be introduced to a number of tools, including the Apache Hadoop architecture, HDFS, HBase, YARN, Apache Spark, and MapReduce. You will also master Spark's RDDs and Datasets, SparkSQL optimization, and Spark's development and runtime environment options, including the basics of parallel programming with SparkSQL for DataFrames and Datasets.

Who is this training for?

Employees of companies facing Big Data challenges
Professionals who apply data science and want to run analyses on Big Data
Those who want to develop their skills in data science
Programmers who want to learn deep learning
Executives and specialists who have no knowledge or experience in this field but want to learn what Big Data makes possible
Programmers and data engineers

Certificate

Those who successfully complete the training will receive a Certified Big Data for Data Science certificate; all other participants will receive a certificate of participation. You can see a sample certificate on the right.

Demonstration lesson

Big Data in 5 minutes

Lesson

What Is Big Data? Big Data Analytics, Big Data Tutorial.

Trainer

Simplilearn

Information

This video, Big Data In 5 Minutes by Simplilearn, will help you understand what Big Data is, the 5 V's of Big Data, why Hadoop came into existence, and what Hadoop is.

Syllabus

Session 1

Operating Systems and Linux (Ubuntu)
What is Version Control and Git?
Managing Projects with GitHub

Session 2

Introduction to Big Data
Introduction to Hadoop
Hadoop Distributed File System (HDFS)
MapReduce Process with MrJob Library

Case Study 1

MapReduce of Data with MrJob Library.
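
As a rough illustration of the map, shuffle, and reduce steps that this case study builds on, here is a minimal sketch in plain Python. It does not use MrJob or Hadoop; the function names (mapper, shuffle, reducer, word_count) and the sample lines are assumptions made purely for illustration, and the actual case study material may look quite different.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z']+")


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in WORD_RE.findall(line.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input, standing in for a real data file.
sample_lines = ["big data is big", "data moves fast"]
counts = word_count(sample_lines)
print(counts)
```

In a real cluster the shuffle step is handled by the framework and the mapper and reducer run in parallel on different machines; this sketch only shows how the data flows between the phases.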

Session 3

Hadoop Ecosystem - Hive, Pig, HBase
Hadoop Ecosystem - Zookeeper, YARN, Ambari
Applying MapReduce Code on Hadoop Ecosystem

Case Study 2

MapReduce Process in the Hadoop Ecosystem, working with real airport data.

Session 4

Introduction to Spark
RDD-based Programming
Transformations and Actions
Datasets and Databases
Introduction to SparkSQL and Spark Structured Streaming
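
To give a feel for the difference between transformations and actions, the sketch below uses a toy class in plain Python. It is not the Spark API: LazyCollection and its methods are hypothetical names invented for this example, chosen only to mirror the idea that transformations are recorded lazily and nothing is computed until an action is called.

```python
class LazyCollection:
    """Toy stand-in for an RDD: records transformations, runs them on an action."""

    def __init__(self, data, steps=()):
        self._data = data
        self._steps = tuple(steps)

    # Transformations: return a new collection and compute nothing yet.
    def map(self, fn):
        return LazyCollection(self._data, self._steps + (("map", fn),))

    def filter(self, predicate):
        return LazyCollection(self._data, self._steps + (("filter", predicate),))

    def _run(self):
        """Apply every recorded step, in order, to the source data."""
        items = iter(self._data)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: trigger the computation and return a concrete result.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []  # records every time the mapped function actually runs


def square(x):
    calls.append(x)
    return x * x


numbers = LazyCollection(range(1, 6))
even_squares = numbers.map(square).filter(lambda value: value % 2 == 0)

calls_before_action = list(calls)  # still empty: nothing has been computed
result = even_squares.collect()    # the action runs the whole pipeline
total = even_squares.count()       # a second action recomputes the pipeline
print(calls_before_action, result, total)
```

Note that the second action recomputes the pipeline from scratch, which is why caching intermediate results matters when the same data is reused.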

Case Study 3

Building Machine Learning Models and Measuring Model Performance using PySpark.

Session 5

Cloud Computing for Big Data: Practical Introduction to AWS, AWS EC2, and Other AWS Services
NoSQL Databases and MongoDB

Case Study 4

Creating a Virtual System on an AWS Server with Python.
Managing NoSQL Databases with PyMongo.

Trainers

Cəlal Rəhmanov

Data Science Expert, Kapital Bank OJSC

With over 4 years of experience in the field of data science, Jalal Rahmanov currently serves as a Data Science Expert at Kapital Bank’s Micro Business Tribe. He was previously a Data Scientist specializing in NLP and AI at Kapital Bank’s Center of Excellence team, where he contributed to international projects presented by foreign experts in European countries such as Germany.

Prior to this, he held the position of CVM and BI - Junior Data Scientist at Yelo Bank. Jalal has both onsite and remote work experience with local and international companies, including Azerbaijan Artificial Intelligence Laboratory, Pasha Bank, and The Sparks Foundation.

He has practical skills in implementing and integrating technologies such as Python, Tableau, Dataiku, SQL, GitLab, Docker, Spark MLlib, and Kafka.