Hadoop fundamentals
Cognitive Class training, MOOC (2020). This learning path presents Hadoop, an open-source framework for distributed storage and processing of big data. It explains Hadoop's conceptual design, introduces MapReduce, YARN (Yet Another Resource Negotiator), and Hive, and then shows how to use Hadoop and manipulate data without complex coding.
Course 1: Hadoop 101
Main topics:
- Introduction to Hadoop;
- Hadoop architecture and HDFS;
- Hadoop administration;
- Hadoop components.
Course 2: MapReduce and YARN
Main topics:
- Introduction to MapReduce and YARN;
- Limitations of Hadoop v1 and MapReduce v1;
- YARN architecture.
Course 3: Moving data into Hadoop
Main topics:
- Load scenarios;
- Using Sqoop;
- Flume overview;
- Using Data Click.
Course 4: Accessing Hadoop data using Hive
Main topics:
- Introduction to Hive;
- Hive DDL (Data Definition Language);
- Hive DML (Data Manipulation Language);
- Hive operators and functions.
References
Training
Hadoop 101 (course certificate)
Hadoop Foundations – Level 1 (certification badge)
MapReduce and YARN (course certificate)
Hadoop Programming – Level 1 (certification badge)
Moving data into Hadoop (course certificate)
Hadoop Administration – Level 1 (certification badge)
Accessing Hadoop data using Hive (course certificate)
Hadoop Data Access – Level 1 (certification badge)
Hadoop Foundations – Level 2 (certification badge)
Related articles
Spark fundamentals (Cognitive Class training)
Data science specialization (Coursera training)