Coursera

Introduction to Big Data (Coursera)

Offered by University of California, San Diego

Interested in increasing your knowledge of the Big Data landscape? This course is for those new to data science and interested in understanding why the Big Data Era has come to be. It is for those who want to become conversant with the terminology and the core concepts behind big data problems, applications, and systems. It is for those who want to start thinking about how Big Data might be useful in their business or career. It provides an introduction to one of the most common frameworks, Hadoop, that has made big data analysis easier and more accessible -- increasing the potential for data to transform our world!

At the end of this course, you will be able to:

Describe the Big Data landscape, including examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting.
Get value out of Big Data by using a 5-step process to structure your analysis.
Identify what are and what are not big data problems, and recast big data problems as data science questions.
Provide an explanation of the architectural components and programming models used for scalable big data analysis.
Summarize the features and value of core Hadoop stack components including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.
Install and run a program using Hadoop!

This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments.
Course 1 of 6 in the Big Data Specialization.
Hardware Requirements:
(A) Quad-core processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB free disk space. How to find your hardware information: (Windows) Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac) Open Overview by clicking the Apple menu and then clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.
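As a rough alternative to the manual steps above, the following Python sketch checks part of the hardware list from a script. It is an illustration only and not part of the course materials: the function names are hypothetical, `os.cpu_count()` reports logical rather than physical cores, the 64-bit test reflects the Python build rather than the CPU itself, and installed RAM is not checked because the standard library has no portable call for it.

```python
import os
import shutil
import sys

# Thresholds taken from the hardware list above.
MIN_CORES = 4
MIN_FREE_DISK_GB = 20


def meets_requirements(cores, is_64bit, free_disk_gb):
    """Compare measured values against the listed minimums."""
    return {
        "cpu_cores": cores >= MIN_CORES,
        "64_bit": bool(is_64bit),
        "free_disk": free_disk_gb >= MIN_FREE_DISK_GB,
    }


def gather_local_values():
    """Collect approximate values for the machine running this script."""
    cores = os.cpu_count() or 0          # logical cores (assumption: close enough)
    is_64bit = sys.maxsize > 2**32       # true for a 64-bit Python build
    home = os.path.expanduser("~")
    free_disk_gb = shutil.disk_usage(home).free / 1e9  # decimal gigabytes
    return cores, is_64bit, free_disk_gb


if __name__ == "__main__":
    for name, ok in meets_requirements(*gather_local_values()).items():
        print(f"{name}: {'ok' if ok else 'below the listed minimum'}")
```

RAM and virtualization support (VT-x or AMD-V) still need to be confirmed through the system dialogs described above.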
Software Requirements:
This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge. Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+, or CentOS 6+, plus VirtualBox 5+.

Syllabus

WEEK 1
Welcome
Welcome to the Big Data Specialization! We're excited for you to get to know us and we're looking forward to learning about you!
Big Data: Why and Where
Data -- it's been around (even digitally) for a while. What makes data "big" and where does this big data come from?

WEEK 2
Characteristics of Big Data and Dimensions of Scalability
You may have heard of the "Big Vs". We'll give examples and descriptions of the commonly discussed 5. But, we want to propose a 6th V and we'll ask you to practice writing Big Data questions targeting this V -- value.
Data Science: Getting Value out of Big Data
We love science and we love computing, don't get us wrong. But the reality is we care about Big Data because it can bring value to our companies, our lives, and the world. In this module we'll introduce a 5 step process for approaching data science problems.

WEEK 3
Foundations for Big Data Systems and Programming
Big Data requires new programming frameworks and systems. For this course, we don't require programming knowledge or experience -- but we do want to give you a grounding in some of the key concepts.
Systems: Getting Started with Hadoop
Let's look at some details of Hadoop and MapReduce. Then we'll go "hands on" and actually perform a simple MapReduce task in the Cloudera VM. Pay attention, as we'll guide you in "learning by doing" by diagramming a MapReduce task as a peer-review exercise.
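For readers curious about what a MapReduce task looks like, here is a minimal pure-Python sketch of the programming model using word counting as the example. It does not use Hadoop or the Cloudera VM, the function names are hypothetical, and the course's own hands-on task may differ; it only illustrates the map, shuffle, and reduce stages on a single machine.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
    # prints {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

In a real cluster the map and reduce calls would run in parallel on different machines, with the framework handling the shuffle step between them; here everything runs in one process to keep the idea visible.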

Related Courses

Coursera

Johns Hopkins University

Exploratory Data Analysis (Coursera)

Statistics & Data Analysis Data Science

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data.

Jun 29th 2026

4 Weeks

Statistics Data Analysis Data Science

Coursera

Google

Process Data from Dirty to Clean (Coursera)

Statistics & Data Analysis Data Science

This is the fourth course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. In this course, you’ll continue to build your understanding of data analytics and the concepts and tools that data analysts use in their work. You’ll learn how to check and clean your data using spreadsheets and SQL as well as how to verify and report your data cleaning results. Current Google data analysts will continue to instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Jun 30th 2026

5-12 Weeks

Data Databases SQL

Coursera

MathWorks

Data Science Companion (Coursera)

Statistics & Data Analysis Data Science

The Data Science Companion provides an introduction to data science. You will gain a quick background in data science and core machine learning concepts, such as regression and classification. You’ll be introduced to the practical knowledge of data processing and visualization using low-code solutions, as well as an overview of the ways to integrate multiple tools effectively to solve data science problems.

Jul 3rd 2026

4 Weeks

MATLAB Machine Learning Regression

Coursera

ESSEC Business School

Foundations of marketing analytics (Coursera)

Marketing & Communication Business

Who is this course for? This course is designed for students, business analysts, and data scientists who want to apply statistical knowledge and techniques to business contexts. For example, it may be suited to experienced statisticians, analysts, engineers who want to move more into a business role, in particular in marketing. You will find this course exciting and rewarding if you already have a background in statistics, can use R or another programming language and are familiar with databases and data analysis techniques such as regression, classification, and clustering. However, it contains a number of recitals and R Studio tutorials which will consolidate your competences, enable you to play more freely with data and explore new features and statistical functions in R.

Jun 29th 2026

5-12 Weeks

Databases Big Data Data Analysis

Coursera

Johns Hopkins University

Statistical Inference (Coursera)

Statistics & Data Analysis Data Science

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 29th 2026

4 Weeks

Statistics Probability Data Analysis

Coursera

University of Illinois at Urbana-Champaign

Cloud Computing Concepts: Part 2 (Coursera)

CS: Software Engineering CS: Theory

Cloud computing systems today, whether open-source or used inside companies, are built using a common set of core techniques, algorithms, and design philosophies—all centered around distributed systems. Learn about such fundamental distributed computing "concepts" for cloud computing. Some of these concepts include: Clouds, MapReduce, key-value stores, Classical precursors, Widely-used algorithms, Classical algorithms, Scalability, Trending areas, And more!

Jun 29th 2026

5-12 Weeks

Programming Cloud Algorithms

Coursera

Johns Hopkins University

Regression Models (Coursera)

Statistics & Data Analysis Data Science

Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models.

Jun 29th 2026

4 Weeks

Statistics Regression Linear Regression

Coursera

Johns Hopkins University

Reproducible Research (Coursera)

Statistics & Data Analysis Data Science

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations.

Jun 29th 2026

4 Weeks

Data Analysis Data Science Reproducible Research

Coursera

Johns Hopkins University

Data Science in Real Life (Coursera)

Management & Leadership Statistics & Data Analysis

Have you ever had the perfect data science experience? The data pull went perfectly. There were no merging errors or missing data. Hypotheses were clearly defined prior to analyses. Randomization was performed for the treatment of interest. The analytic plan was outlined prior to analysis and followed exactly. The conclusions were clear and actionable decisions were obvious. Has that ever happened to you? Of course not. Data analysis in real life is messy. How does one manage a team facing real data analyses? In this one-week course, we contrast the ideal with what happens in real life. By contrasting the ideal, you will learn key concepts that will help you manage real life analyses.

Jun 29th 2026

1 Week

Statistics Machine Learning Data Management

Coursera

University of Illinois at Urbana-Champaign

Digital Marketing Analytics in Practice (Coursera)

Marketing & Communication Business

Successfully marketing brands today requires a well-balanced blend of art and science. This course introduces students to the science of web analytics while casting a keen eye toward the artful use of numbers found in the digital space. The goal is to provide the foundation needed to apply data analytics to real-world challenges marketers confront daily. Students will learn to identify the web analytic tool right for their specific needs; understand valid and reliable ways to collect, analyze, and visualize data from the web; and utilize data in decision making for agencies, organizations or clients.

Jun 29th 2026

4 Weeks

Marketing Data Analysis Digital Analytics

Coursera

Google

Foundations: Data, Data, Everywhere (Coursera)

Statistics & Data Analysis Data Science

This is the first course in the Google Data Analytics Certificate. These courses will equip you with the skills you need to apply to introductory-level data analyst jobs. Organizations of all kinds need data analysts to help them improve their processes, identify opportunities and trends, launch new products, and make thoughtful decisions. In this course, you’ll be introduced to the world of data analytics through hands-on curriculum developed by Google. The material shared covers plenty of key data analytics topics, and it’s designed to give you an overview of what’s to come in the Google Data Analytics Certificate. Current Google data analysts will instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Jun 30th 2026

5-12 Weeks

Data SQL Spreadsheets

Coursera

Google

Analyze Data to Answer Questions (Coursera)

Statistics & Data Analysis Data Science

This is the fifth course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. In this course, you’ll explore the “analyze” phase of the data analysis process. You’ll take what you’ve learned to this point and apply it to your analysis to make sense of the data you’ve collected. You’ll learn how to organize and format your data using spreadsheets and SQL to help you look at and think about your data in different ways. You’ll also find out how to perform complex calculations on your data to complete business objectives.

Jun 30th 2026

4 Weeks

Data Databases SQL