How to Process Big Data. The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. Learn the fundamental principles behind it, and how you can use its power to make sense of your Big Data.
What You Will Learn
Lesson 1
Big Data
- What is Big Data?
- The problems big data creates.
- How Apache Hadoop addresses these problems.
Lesson 2
HDFS and MapReduce
- Discover how HDFS distributes data over multiple computers.
- Learn how MapReduce enables analyzing datasets in parallel across multiple machines.
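As a rough illustration of the two ideas above, here is a minimal sketch in plain Python that imitates the MapReduce flow on a single machine. It does not use Hadoop or any of its APIs: the function names (`mapper`, `shuffle`, `reducer`, `word_count`) and the sample data are hypothetical, and the list of chunks only stands in for the way HDFS spreads blocks of a file over several computers.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce step: combine all values seen for one key into a single total."""
    return key, sum(values)


def word_count(chunks):
    """Run map, shuffle, and reduce over a list of chunks (lists of lines)."""
    mapped = []
    # On a real cluster each chunk could be mapped on a different machine;
    # here the chunks are simply processed one after another.
    for chunk in chunks:
        for line in chunk:
            mapped.extend(mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())


# Two chunks standing in for blocks stored on two different machines.
chunks = [["big data is big"], ["hadoop processes big data"]]
counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped independently of the others, the map step is the part that can run in parallel; only the grouping step needs to see the pairs from every chunk.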
Lesson 3
MapReduce code
- Write your own MapReduce code.
Lesson 4
MapReduce Design Patterns
- Use common patterns for MapReduce programs to analyze Udacity forum data.
What you will learn:
- Understand how Hadoop fits into the world (recognize the problems it solves)
- Understand the concepts of HDFS and MapReduce (find out how they solve those problems)
- Write MapReduce programs (see how we solve the problems)
- Practice solving problems on your own
Prerequisites and Requirements
Lesson 1 has no technical prerequisites and is a good overview of Hadoop and MapReduce for managers. To get the most out of the class, however, you need basic programming skills in Python at the level provided by introductory courses like our Introduction to Computer Science course.