Statistical Inference and Hypothesis Testing in Data Science Applications (Coursera)


This course will focus on theory and implementation of hypothesis testing, especially as it relates to applications in data science. Students will learn to use hypothesis tests to make informed decisions from data. Special attention will be given to the general logic of hypothesis testing, error and error rates, power, simulation, and the correct computation and interpretation of p-values. Attention will also be given to the misuse of testing concepts, especially p-values, and the ethical implications of such misuse.


This course can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics.

What You Will Learn

  • Define a composite hypothesis and the level of significance for a test with a composite null hypothesis.
  • Define a test statistic, level of significance, and the rejection region for a hypothesis test. Give the form of a rejection region.
  • Perform tests concerning a true population variance.
  • Compute the sampling distributions for the sample mean and sample minimum of the exponential distribution.

Course 3 of 3 in the Data Science Foundations: Statistical Inference Specialization

Syllabus

WEEK 1
Fundamental Concepts of Hypothesis Testing
In this module, we will define a hypothesis test and develop the intuition behind designing a test. We will learn the language of hypothesis testing, which includes definitions of a null hypothesis, an alternative hypothesis, and the level of significance of a test. We will walk through a very simple test.
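As a rough illustration of the kind of very simple test this module describes, the sketch below tests H0: mu = mu0 against H1: mu > mu0 at a chosen level of significance. The normal model with known standard deviation, the function name `z_test_upper`, and the sample numbers are assumptions made for illustration; they are not taken from the course materials.

```python
from math import sqrt
from statistics import NormalDist


def z_test_upper(sample_mean, n, mu0, sigma, alpha=0.05):
    """One-sided z-test of H0: mu = mu0 against H1: mu > mu0.

    Assumes normal data with known standard deviation sigma (an
    illustrative assumption, not something stated by the course).
    Returns the cutoff of the rejection region and the decision.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-alpha quantile of N(0, 1)
    cutoff = mu0 + z_crit * sigma / sqrt(n)    # reject when sample_mean > cutoff
    return cutoff, sample_mean > cutoff


# Hypothetical numbers: n = 25 observations, sigma = 2, H0: mu = 10.
cutoff, reject = z_test_upper(sample_mean=10.9, n=25, mu0=10.0, sigma=2.0)
print(f"rejection region: sample mean > {cutoff:.3f}; reject H0: {reject}")
```

The rejection region here has the form "sample mean greater than a cutoff", and the cutoff is chosen so that the probability of rejecting a true null hypothesis equals the level of significance.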

WEEK 2
Composite Tests, Power Functions, and P-Values
In this module, we will expand the lessons of Module 1 to composite hypotheses for both one- and two-tailed tests. We will define the “power function” for a test and discuss its interpretation and how it can lead to the idea of a “uniformly most powerful” test. We will discuss and interpret “p-values” as an alternate approach to hypothesis testing.
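To make the power function and the p-value concrete, here is a minimal sketch for a one-sided test of H0: mu <= mu0 against H1: mu > mu0. It assumes normal data with known standard deviation; the helper names `power` and `p_value` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power(mu, n, mu0, sigma, alpha=0.05):
    """Probability of rejecting H0: mu <= mu0 when the true mean is mu."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    shift = (mu - mu0) * sqrt(n) / sigma   # standardized distance from mu0
    return 1 - STD_NORMAL.cdf(z_crit - shift)


def p_value(sample_mean, n, mu0, sigma):
    """Upper-tail p-value of the observed sample mean, computed at mu = mu0."""
    z = (sample_mean - mu0) * sqrt(n) / sigma
    return 1 - STD_NORMAL.cdf(z)


# Hypothetical setting: n = 25, sigma = 2, boundary of the null at mu0 = 10.
for mu in (9.5, 10.0, 10.5, 11.0):
    print(f"mu = {mu:4.1f}  power = {power(mu, 25, 10.0, 2.0):.3f}")
print(f"p-value for a sample mean of 10.9: {p_value(10.9, 25, 10.0, 2.0):.4f}")
```

In this sketch the power function is increasing in mu and equals the level of significance at the boundary mu = mu0, which is why the level of a test with a composite null hypothesis can be read off at the boundary.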

WEEK 3
t-Tests and Two-Sample Tests
In this module, we will learn about the chi-squared and t distributions and their relationships to sampling distributions. We will learn to identify when hypothesis tests based on these distributions are appropriate. We will review the concept of sample variance and derive the “t-test”. Additionally, we will derive our first two-sample test and apply it to make some decisions about real data.
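The one-sample t statistic can be sketched in a few lines. Because the Python standard library has no t distribution, this example swaps in a Monte Carlo estimate of the critical value instead of a table lookup; that substitution, the function names, and the data are assumptions made for illustration and not the course's own method.

```python
import random
from math import sqrt


def t_statistic(sample, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)   # sample variance
    return (mean - mu0) / sqrt(var / n)


def simulated_critical_value(n, alpha=0.05, reps=20_000, seed=0):
    """Monte Carlo estimate of the two-sided critical value for |t|.

    Under H0 with normal data the t statistic does not depend on the
    unknown mean or variance, so standard normal samples suffice.
    """
    rng = random.Random(seed)
    stats = sorted(
        abs(t_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], 0.0))
        for _ in range(reps)
    )
    return stats[int((1 - alpha) * reps)]


# Hypothetical data, testing H0: mu = 5 against a two-sided alternative.
data = [5.1, 4.9, 5.6, 5.8, 5.3, 5.2, 4.7, 5.9, 5.4, 5.5]
t_obs = t_statistic(data, mu0=5.0)
crit = simulated_critical_value(len(data))
print(f"t = {t_obs:.3f}, simulated critical value = {crit:.3f}, "
      f"reject H0: {abs(t_obs) > crit}")
```

With 10 observations the simulated critical value should land close to the tabulated t value of about 2.262 for 9 degrees of freedom, up to simulation error.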

WEEK 4
Beyond Normality
In this module, we will consider some problems where the assumption of an underlying normal distribution is not appropriate and will expand our ability to construct hypothesis tests for this case. We will define the concept of a “uniformly most powerful” (UMP) test, discuss whether or not such a test exists for specific problems, and revisit some of our earlier tests from Modules 1 and 2 through the UMP lens. We will also introduce the F-distribution and its role in testing whether or not two population variances are equal.

WEEK 5
Likelihood Ratio Tests and Chi-Squared Tests
In this module, we develop a formal approach to hypothesis testing, based on a “likelihood ratio” that can be more generally applied than any of the tests we have discussed so far. We will pay special attention to the large sample properties of the likelihood ratio, especially Wilks’ Theorem, which will allow us to come up with approximate (but easy) tests when we have a large sample size. We will close the course with two chi-squared tests that can be used to test whether the distributional assumptions we have been making throughout this course are valid.
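A likelihood ratio test with a Wilks-style approximation might look like the sketch below, which tests H0: rate = rate0 for exponential data. The choice of the exponential model, the function name `exponential_lrt`, and the sample are illustrative assumptions; the chi-squared approximation is a large-sample result and is only rough for a sample this small.

```python
from math import erfc, log, sqrt


def exponential_lrt(sample, rate0):
    """Likelihood ratio test of H0: rate = rate0 for exponential data.

    Returns the statistic -2 log(Lambda) and an approximate p-value from
    the chi-squared distribution with 1 degree of freedom (Wilks' Theorem).
    """
    n = len(sample)
    total = sum(sample)
    rate_hat = n / total   # maximum likelihood estimate of the rate
    # -2 log(Lambda) = 2 * [loglik(rate_hat) - loglik(rate0)],
    # where loglik(rate) = n * log(rate) - rate * total.
    stat = 2 * (n * log(rate_hat / rate0) - (rate_hat - rate0) * total)
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # For chi-squared with 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


# Hypothetical waiting times, testing H0: rate = 1.
sample = [0.2, 1.1, 0.4, 0.9, 0.3, 0.6, 0.1, 0.8]
stat, approx_p = exponential_lrt(sample, rate0=1.0)
print(f"-2 log(Lambda) = {stat:.3f}, approximate p-value = {approx_p:.3f}")
```

The statistic is zero when the maximum likelihood estimate coincides with the hypothesized rate and grows as the data move away from the null hypothesis.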


Related Courses

Managing Data Analysis (Coursera)
Johns Hopkins University

This one-week course describes the process of analyzing data and how to manage that process. We describe the iterative nature of data analysis and the role of stating a sharp question, exploratory data analysis, inference, formal statistical modeling, interpretation, and communication. In addition, we will describe how to direct analytic activities within a team and to drive the data analysis process towards coherent and useful results.

Jun 15th 2026
1 Week

Designing, Running, and Analyzing Experiments (Coursera)
University of California, San Diego

You may never be sure whether you have an effective user experience until you have tested it with users. In this course, you’ll learn how to design user-centered experiments, how to run such experiments, and how to analyze data from these experiments in order to evaluate and validate user experiences. You will work through real-world examples of experiments from the fields of UX, IxD, and HCI, understanding issues in experiment design and analysis.

Jun 15th 2026
5-12 Weeks

A Crash Course in Data Science (Coursera)
Johns Hopkins University

By now you have definitely heard about data science and big data. In this one-week class, we will provide a crash course in what these terms mean and how they play a role in successful organizations. This class is for anyone who wants to learn what all the data science action is about, including those who will eventually need to manage data scientists. The goal is to get you up to speed as quickly as possible on data science without all the fluff. We've designed this course to be as convenient as possible without sacrificing any of the essentials.

Jun 15th 2026
1 Week

Building R Packages (Coursera)
Johns Hopkins University

Writing good code for data science is only part of the job. To maximize the usefulness and reusability of data science software, code must be organized and distributed in a manner that adheres to community-based standards and provides a good user experience. This course covers the primary means by which R software is organized and distributed to others.

Jun 15th 2026
4 Weeks

Building a Data Science Team (Coursera)
Johns Hopkins University

Data science is a team sport. As a data science executive it is your job to recruit, organize, and manage the team to success. In this one-week course, we will cover how you can find the right people to fill out your data science team, how to organize them to give them the best chance to feel empowered and successful, and how to manage your team as it grows.

Jun 15th 2026
1 Week

Genomic Data Science and Clustering (Bioinformatics V) (Coursera)
University of California, San Diego

How do we infer which genes orchestrate various processes in the cell? How did humans migrate out of Africa and spread around the world? In this class, we will see that these two seemingly different questions can be addressed using similar algorithmic and machine learning techniques arising from the general problem of dividing data points into distinct clusters.

Jun 15th 2026
3 Weeks

Strategic Planning and Execution (Coursera)
University of Virginia

Avoid the pitfalls of strategy planning and execution with the tools and skills from this course. You'll learn the pillars of strategy execution (analysis, formulation, and implementation) and how to use the 4A model to effectively approach strategy execution. Finally, a panel of entrepreneurs, nonprofit leaders, and industry leaders shares expertise gleaned from years of successful strategy planning and execution.

Jun 15th 2026
4 Weeks

Nearest Neighbor Collaborative Filtering (Coursera)
University of Minnesota

In this course, you will learn the fundamental techniques for making personalized recommendations through nearest-neighbor techniques. First you will learn user-user collaborative filtering, an algorithm that identifies other people with similar tastes to a target user and combines their ratings to make recommendations for that user.

Jun 15th 2026
4 Weeks

Statistical Inference (Coursera)
Johns Hopkins University

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 15th 2026
4 Weeks