Exploratory Data Analysis (Coursera)

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data.

We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

What You Will Learn

  • Understand analytic graphics and the base plotting system in R
  • Use advanced graphing systems such as the Lattice system
  • Make graphical displays of very high dimensional data
  • Apply cluster analysis techniques to locate patterns in data

Syllabus

WEEK 1
This week covers the basics of analytic graphics and the base plotting system in R. We've also included some background material to help you install R if you haven't done so already.

WEEK 2
This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was part of the original R system and lacks many features that may be desirable in a plotting system, particularly when visualizing high-dimensional data. The Lattice and ggplot2 systems also simplify the layout of plots, making it a much less tedious process.

WEEK 3
This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high-dimensional data (many, many variables). We also cover novel ways to specify colors in R so that you can use color as an important and useful dimension when making data graphics. All of this material is covered in chapters 9-12 of my book Exploratory Data Analysis with R.

WEEK 4
This week, we'll look at two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques, and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but I'm providing these videos to give you a sense of how you might proceed with a specific type of dataset.

Related Courses

Big Data Modeling and Management Systems (Coursera)
University of California, San Diego

Once you’ve identified a big data issue to analyze, how do you collect, store and organize your data using Big Data solutions? In this course, you will experience various data genres and management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools.

Jun 8th 2026
5-12 Weeks

Effective Problem-Solving and Decision-Making (Coursera)
University of California, Irvine

Critical thinking – the application of scientific methods and logical reasoning to problems and decisions – is the foundation of effective problem solving and decision making. Critical thinking enables us to avoid common obstacles, test our beliefs and assumptions, and correct distortions in our thought processes. Gain confidence in assessing problems accurately, evaluating alternative solutions, and anticipating likely risks. Learn how to use analysis, synthesis, and positive inquiry to address individual and organizational problems and develop the critical thinking skills needed in today’s turbulent times. Using case studies and situations encountered by class members, explore successful models and proven methods that are readily transferable to the job.

Jun 8th 2026
4 Weeks

Understanding China, 1700-2000: A Data Analytic Approach, Part 1 (Coursera)
The Hong Kong University of Science and Technology - HKUST

The purpose of this course is to summarize new directions in Chinese history and social science produced by the creation and analysis of big historical datasets based on newly opened Chinese archival holdings, and to organize this knowledge in a framework that encourages learning about China in comparative perspective. Our course demonstrates how a new scholarship of discovery is redefining what is singular about modern China and modern Chinese history.

Jun 8th 2026
5-12 Weeks

Linear Regression and Modeling (Coursera)
Duke University

This course introduces simple and multiple linear regression models. These models allow you to assess the relationship between variables in a data set and a continuous response variable. Is there a relationship between the physical attractiveness of a professor and their student evaluation scores? Can we predict the test score for a child based on certain characteristics of his or her mother? In this course, you will learn the fundamental theory behind linear regression and, through data examples, learn to fit, examine, and utilize regression models to examine relationships between multiple variables, using the free statistical software R and RStudio.

Jun 8th 2026
4 Weeks

Business Intelligence Concepts, Tools, and Applications (Coursera)
University of Colorado System

This is the fourth course in the Data Warehouse for Business Intelligence specialization. Ideally, the courses should be taken in sequence. In this course, you will gain the knowledge and skills for using data warehouses for business intelligence purposes and for working as a business intelligence developer. You’ll have the opportunity to work with large data sets in a data warehouse environment and will learn the use of MicroStrategy's Online Analytical Processing (OLAP) and Visualization capabilities to create visualizations and dashboards.

Jun 8th 2026
5-12 Weeks

Data Manipulation at Scale: Systems and Algorithms (Coursera)
University of Washington

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making; we are drowning in data. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but also the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.

Jun 8th 2026
4 Weeks

Basic Statistics (Coursera)
University of Amsterdam

Understanding statistics is essential to understanding research in the social and behavioral sciences. In this course you will learn the basics of statistics: not just how to calculate them, but also how to evaluate them. This course will also prepare you for the next course in the specialization, Inferential Statistics. In the first part of the course we will discuss methods of descriptive statistics. You will learn what cases and variables are and how you can compute measures of central tendency (mean, median and mode) and dispersion (standard deviation and variance). Next, we discuss how to assess relationships between variables, and we introduce the concepts of correlation and regression.

Jun 8th 2026
5-12 Weeks

The Structured Query Language (SQL) (Coursera)
University of Colorado Boulder

In this course you will learn all about the Structured Query Language ("SQL"). We will review the origins of the language and its conceptual foundations, but our primary focus is on learning all the standard SQL commands, their syntax, and how to use these commands to analyze the data within a relational database. Our scope includes not only the SELECT statement for retrieving data and creating analytical reports, but also the DDL ("Data Definition Language") and DML ("Data Manipulation Language") commands necessary to create and maintain database objects.

Jun 9th 2026
5-12 Weeks

Communicating Data Science Results (Coursera)
University of Washington

Making predictions is not enough! Effective data scientists know how to explain and interpret their results, and communicate findings accurately to stakeholders to inform business decisions. Visualization is the field of research in computer science that studies effective communication of quantitative results by linking perception, cognition, and algorithms to exploit the enormous bandwidth of the human visual cortex. In this course you will learn to recognize, design, and use effective visualizations.

Jun 8th 2026
3 Weeks

Introduction to Spreadsheets and Models (Coursera)
University of Pennsylvania

The simple spreadsheet is one of the most powerful data analysis tools that exists, and it’s available to almost anyone. Major corporations and small businesses alike use spreadsheet models to determine where key measures of their success are now, and where they are likely to be in the future. But in order to get the most out of a spreadsheet, you have to know how to use it. This course is designed to give you an introduction to basic spreadsheet tools and formulas so that you can begin to harness the power of spreadsheets to map the data you have now and to predict the data you may have in the future.

Jun 8th 2026
4 Weeks