MapReduce

Machine Learning: Clustering & Retrieval (Coursera)

Case Studies: Finding Similar Documents. A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to retrieve a new document, [...]

Hadoop Platform and Application Framework (Coursera)

This course is for novice programmers or business people who'd like to understand the core tools used to wrangle and analyze big data. With no prior experience, you'll have the opportunity to walk through hands-on examples with Hadoop and Spark frameworks, two of the most common in the industry. [...]

Introduction to Big Data (Coursera)

Interested in increasing your knowledge of the Big Data landscape? This course is for those new to data science and interested in understanding why the Big Data Era has come to be. It is for those who want to become conversant with the terminology and the core [...]

Cloud Computing Concepts: Part 2 (Coursera)

Cloud computing systems today, whether open-source or used inside companies, are built using a common set of core techniques, algorithms, and design philosophies—all centered around distributed systems. Learn about such fundamental distributed computing "concepts" for cloud computing. Some of these concepts include: Clouds, MapReduce, key-value stores, Classical precursors, Widely-used [...]

Big Data Analysis Deep Dive (Coursera)

The job market for architects, engineers, and analytics professionals with Big Data expertise continues to grow. The Academy’s Big Data Career path focuses on the fundamental tools and techniques needed to pursue a career in Big Data. This course includes: data processing with Python, writing and reading SQL queries, [...]

Data Manipulation at Scale: Systems and Algorithms (Coursera)

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making --- we are drowning in it. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but the programming abstractions to use them effectively. The abstractions that emerged in the last decade [...]

Cloud Computing Concepts, Part 1 (Coursera)

Cloud computing systems today, whether open-source or used inside companies, are built using a common set of core techniques, algorithms, and design philosophies—all centered around distributed systems. Learn about such fundamental distributed computing "concepts" for cloud computing. Some of these concepts include: clouds, MapReduce, key-value/NoSQL stores, classical distributed algorithms, [...]

Big Data for Agri-Food: Principles and Tools (edX)

As the big data era unfolds, developments in sensor and information technologies are evolving quickly. As a result, science and businesses are yielding enormous amounts of data. Yet, to reap the actionable business solutions data can unveil, we must learn to ask the right questions. Join Wageningen University [...]

Big Data Analytics Using Spark (edX)

Learn how to analyze large datasets using Jupyter notebooks, MapReduce and Spark as a platform. In data science, data is called “big” if it cannot fit into the memory of a single standard laptop or workstation. The analysis of big datasets requires using a cluster of tens, hundreds or [...]