Udacity

Intro to Hadoop and MapReduce (Udacity)

Offered by Udacity and Cloudera

How to Process Big Data. The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. Learn the fundamental principles behind it, and how you can use its power to make sense of your Big Data.

What You Will Learn

Lesson 1
Big Data

What is Big Data?
The problems big data creates.
How Apache Hadoop addresses these problems.

Lesson 2
HDFS and MapReduce

Discover how HDFS distributes data over multiple computers.
Learn how MapReduce enables analyzing datasets in parallel across multiple machines.

Lesson 3
MapReduce code

Write your own MapReduce code.

Lesson 4
MapReduce Design Patterns

Use common patterns for MapReduce programs to analyze Udacity forum data.
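
As a rough illustration of the map, shuffle, and reduce steps named in Lessons 2 and 3, here is a minimal word-count sketch in plain Python. It runs in a single process and does not use Hadoop. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for this example and are not taken from the course materials.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, imitating what a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a large file.
    print(word_count(["big data is big", "data is everywhere"]))
```

On a real cluster the map and reduce steps would run on many machines at once and the framework would handle the grouping; this sketch only shows the shape of the computation.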

What you will learn:

Understand how Hadoop fits into the world (recognize the problems it solves)
Understand the concepts of HDFS and MapReduce (find out how Hadoop solves those problems)
Write MapReduce programs (see how we solve the problems)
Practice solving problems on your own

Prerequisites and Requirements
Lesson 1 has no technical prerequisites and is a good overview of Hadoop and MapReduce for managers. To get the most out of the class, however, you need basic programming skills in Python, at the level of an introductory course such as our Introduction to Computer Science course.

Related Courses

Udacity

Intro to AJAX (Udacity)

CS: Programming Computer Science

Making Asynchronous Requests with jQuery. In this course you will learn how to make asynchronous requests with JavaScript (using jQuery’s AJAX functionality), and gain a better understanding of what’s actually happening when you do so. You will also learn how to use data APIs so you can take advantage of freely accessible data in your applications, including photo results, news articles and up-to-date data about the world around us.

Self-Paced

Programming AJAX JQuery

Udacity

Intro to Data Science (Udacity)

Statistics & Data Analysis Data Science

Learn what it takes to become a data scientist. The Introduction to Data Science class will survey the foundational topics in data science, namely: Data Manipulation; Data Analysis with Statistics and Machine Learning; Data Communication with Information Visualization; Data at Scale -- Working with Big Data.

Self-Paced

Statistics Machine Learning Big Data

Udacity

Design of Computer Programs (Udacity)

CS: Design & Product Computer Science

Programming Principles. Understanding how to approach programming problems and devise a solution is an essential skill for any Python developer. In this course, programming expert Peter Norvig teaches new concepts, patterns, and methods that will expand your coding abilities.

Self-Paced

Programming Python Computer Programs

Udacity

Intro to jQuery (Udacity)

Computer Science

Manipulating Websites with Ease. jQuery is the most popular JavaScript library today, in use by over 60% of the top 100,000 most visited websites. This course will teach you how to use jQuery’s core features: DOM element selection, traversal, and manipulation. You'll also learn how to read and make sense of jQuery's documentation, making it easy for you to go beyond the methods taught in this class and take advantage of jQuery's full array of features!

Self-Paced

Programming HTML Javascript

Udacity

C++ For Programmers (Udacity)

CS: Programming

Learn features and constructs for C++. C++ for Programmers is designed for students who are familiar with a programming language and wish to learn C++. This course focuses on 'how' as opposed to 'what'. For example, in the lesson on functions, we do not teach what a function is, but rather how to create a function in C++. The lessons are taught by several different instructors who have used C++ in their professional careers, so students get to experience different perspectives.

Self-Paced

Programming C++ Object-Oriented Programming

Udacity, Google

Advanced Android with Kotlin (Udacity)

CS: Programming

Develop Feature-Rich Android Apps with the Kotlin Programming Language. Go beyond the basics of building an Android app with "Advanced Android with Kotlin". This course teaches you how to add a range of advanced features to your app, starting with best practices for using Android's notification system.

Self-Paced

Programming Android Android Apps

Udacity, Google

Website Performance Optimization (Udacity)

CS: Software Engineering

The Critical Rendering Path. You will learn how to optimize any website for speed by diving into the details of how mobile and desktop browsers render pages. In this short course, you’ll learn about the Critical Rendering Path, or the set of steps browsers must take to convert HTML, CSS and JavaScript into living, breathing websites. From there, you’ll start exploring and experimenting with tools to measure performance and simple strategies to deliver the first pixels to the screen as early as possible.

Self-Paced

Programming HTML Javascript

Udacity

Model Building and Validation (Udacity)

Statistics & Data Analysis Data Science

Advanced Techniques for Analyzing Data. This course will teach you how to start from scratch in answering questions about the real world using data. Machine learning happens to be a small part of this process. The model building process involves setting up ways of collecting data, understanding and paying attention to what is important in the data to answer the questions you are asking, and finding a statistical, mathematical, or simulation model to gain understanding and make predictions.

Self-Paced

Machine Learning Modeling Data Analysis

Udacity

HTML5 Canvas (Udacity)

CS: Software Engineering CS: Programming

From Pixels to Animation! Canvas is an HTML5 element that gives you a drawable surface inside your web pages, which you can control with JavaScript. It is powerful enough to use for compositing images and even creating games. In this course, through several sample projects, you’ll learn how to use the canvas; how to make compositions using shapes, images, and text; how to create effects and filters on images; and how to create animations.

Self-Paced

Game Programming HTML

Udacity, Insight

Spark (Udacity)

Data Science

Work with big data and build machine learning models at scale using Spark! In this course, you’ll learn how to wrangle and model massive datasets with PySpark, the Python library for interacting with Spark. In the first lesson, you will learn about big data and how Spark fits into the big data ecosystem. In lesson two, you will practice processing and cleaning datasets to get comfortable with Spark’s SQL and DataFrame APIs. In the third lesson, you will debug and optimize your Spark code when running on a cluster. In lesson four, you will use Spark’s Machine Learning Library to train machine learning models at scale.

Self-Paced

Python Debugging Machine Learning

Udacity, MongoDB University

Data Wrangling with MongoDB (Udacity)

CS: Software Engineering

In this course, we will explore how to wrangle data from diverse sources and shape it to enable data-driven applications. Some data scientists spend the bulk of their time doing this! Students will learn how to gather and extract data from widely used data formats. They will learn how to assess the quality of data and explore best practices for data cleaning. We will also introduce students to MongoDB, covering the essentials of storing data and the MongoDB query language together with exploratory analysis using the MongoDB aggregation framework.

Self-Paced

Programming MongoDB Databases

Udacity, Twitter

Real-Time Analytics with Apache Storm (Udacity)

Statistics & Data Analysis Data Science

The world is trending in real time! Learn from Twitter how to scalably process tweets, or any big data stream, in real time to drive d3 visualizations using Apache Storm, the "Hadoop of Real Time." Storm is free, open source, and fun to use! Learn from Karthik Ramasamy about the distributed, fault-tolerant, and flexible technology used to power Twitter’s real-time data flow pipeline. Twitter open sourced Storm in 2011, and it graduated to a top-level Apache project in September 2014.

Self-Paced

Data Analysis Hadoop Data Science