Machine Learning: Regression (Coursera)

Machine Learning: Regression (Coursera)

Case Study - Predicting Housing Prices. In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,...). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

In this course, you will explore regularized linear regression models for the task of prediction and feature selection. You will be able to handle very large sets of features and select between models of various complexity. You will also analyze the impact of aspects of your data -- such as outliers -- on your selected models and predictions. To fit these models, you will implement optimization algorithms that scale to large datasets.
By the end of this course, you will be able to:
-Describe the input and output of a regression model.
-Compare and contrast bias and variance when modeling data.
-Estimate model parameters using optimization algorithms.
-Tune parameters with cross validation.
-Analyze the performance of the model.
-Describe the notion of sparsity and how LASSO leads to sparse solutions.
-Deploy methods to select between models.
-Exploit the model to form predictions.
-Build a regression model to predict prices using a housing dataset.
-Implement these techniques in Python.
Course 2 of 4 in the Machine Learning Specialization.

Syllabus

WEEK 1
Welcome
Regression is one of the most important and broadly used machine learning and statistics tools out there. It allows you to make predictions from data by learning the relationship between features of your data and some observed, continuous-valued response. Regression is used in a massive number of applications ranging from predicting stock prices to understanding gene regulatory networks.This introduction to the course provides you with an overview of the topics we will cover and the background knowledge and resources we assume you have.
Simple Linear Regression
Our course starts from the most basic regression model: Just fitting a line to data. This simple model for forming predictions from a single, univariate feature of the data is appropriately called "simple linear regression". In this module, we describe the high-level regression task and then specialize these concepts to the simple linear regression case. You will learn how to formulate a simple regression model and fit the model to data using both a closed-form solution as well as an iterative optimization algorithm called gradient descent. Based on this fitted function, you will interpret the estimated model parameters and form predictions. You will also analyze the sensitivity of your fit to outlying observations. You will examine all of these concepts in the context of a case study of predicting house prices from the square feet of the house.

WEEK 2
Multiple Regression
The next step in moving beyond simple linear regression is to consider "multiple regression" where multiple features of the data are used to form predictions. More specifically, in this module, you will learn how to build models of more complex relationship between a single variable (e.g., 'square feet') and the observed response (like 'house sales price'). This includes things like fitting a polynomial to your data, or capturing seasonal changes in the response value. You will also learn how to incorporate multiple input variables (e.g., 'square feet', '# bedrooms', '# bathrooms'). You will then be able to describe how all of these models can still be cast within the linear regression framework, but now using multiple "features". Within this multiple regression framework, you will fit models to data, interpret estimated coefficients, and form predictions. Here, you will also implement a gradient descent algorithm for fitting a multiple regression model.

WEEK 3
Assessing Performance
Having learned about linear regression models and algorithms for estimating the parameters of such models, you are now ready to assess how well your considered method should perform in predicting new data. You are also ready to select amongst possible models to choose the best performing. This module is all about these important topics of model selection and assessment. You will examine both theoretical and practical aspects of such analyses. You will first explore the concept of measuring the "loss" of your predictions, and use this to define training, test, and generalization error. For these measures of error, you will analyze how they vary with model complexity and how they might be utilized to form a valid assessment of predictive performance. This leads directly to an important conversation about the bias-variance tradeoff, which is fundamental to machine learning. Finally, you will devise a method to first select amongst models and then assess the performance of the selected model. The concepts described in this module are key to all machine learning problems, well-beyond the regression setting addressed in this course.

WEEK 4
Ridge Regression
You have examined how the performance of a model varies with increasing model complexity, and can describe the potential pitfall of complex models becoming overfit to the training data. In this module, you will explore a very simple, but extremely effective technique for automatically coping with this issue. This method is called "ridge regression". You start out with a complex model, but now fit the model in a manner that not only incorporates a measure of fit to the training data, but also a term that biases the solution away from overfitted functions. To this end, you will explore symptoms of overfitted functions and use this to define a quantitative measure to use in your revised optimization objective. You will derive both a closed-form and gradient descent algorithm for fitting the ridge regression objective; these forms are small modifications from the original algorithms you derived for multiple regression. To select the strength of the bias away from overfitting, you will explore a general-purpose method called "cross validation". You will implement both cross-validation and gradient descent to fit a ridge regression model and select the regularization constant.

WEEK 5
Feature Selection & Lasso
A fundamental machine learning task is to select amongst a set of features to include in a model. In this module, you will explore this idea in the context of multiple regression, and describe how such feature selection is important for both interpretability and efficiency of forming predictions. To start, you will examine methods that search over an enumeration of models including different subsets of features. You will analyze both exhaustive search and greedy algorithms. Then, instead of an explicit enumeration, we turn to Lasso regression, which implicitly performs feature selection in a manner akin to ridge regression: A complex model is fit based on a measure of fit to the training data plus a measure of overfitting different than that used in ridge. This lasso method has had impact in numerous applied domains, and the ideas behind the method have fundamentally changed machine learning and statistics. You will also implement a coordinate descent algorithm for fitting a Lasso model. Coordinate descent is another, general, optimization technique, which is useful in many areas of machine learning.

WEEK 6
Nearest Neighbors & Kernel Regression
Up to this point, we have focused on methods that fit parametric functions---like polynomials and hyperplanes---to the entire dataset. In this module, we instead turn our attention to a class of "nonparametric" methods. These methods allow the complexity of the model to increase as more data are observed, and result in fits that adapt locally to the observations. We start by considering the simple and intuitive example of nonparametric methods, nearest neighbor regression: The prediction for a query point is based on the outputs of the most related observations in the training set. This approach is extremely simple, but can provide excellent predictions, especially for large datasets. You will deploy algorithms to search for the nearest neighbors and form predictions based on the discovered neighbors. Building on this idea, we turn to kernel regression. Instead of forming predictions based on a small set of neighboring observations, kernel regression uses all observations in the dataset, but the impact of these observations on the predicted value is weighted by their similarity to the query point. You will analyze the theoretical performance of these methods in the limit of infinite training data, and explore the scenarios in which these methods work well versus struggle. You will also implement these techniques and observe their practical behavior.
In the conclusion of the course, we will recap what we have covered. This represents both techniques specific to regression, as well as foundational machine learning concepts that will appear throughout the specialization. We also briefly discuss some important regression techniques we did not cover in this course. We conclude with an overview of what's in store for you in the rest of the specialization.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Analysis of Algorithms (Coursera) Coursera
Princeton University

Analysis of Algorithms (Coursera)

This course teaches a calculus that enables precise quantitative predictions of large combinatorial structures. In addition, this course covers generating functions and real asymptotics and then introduces the symbolic method in the context of applications in the analysis of algorithms and basic structures such as permutations, trees, strings, words, and mappings.

Jul 6th 2026
5-12 Weeks
Pattern Discovery in Data Mining (Coursera) Coursera
University of Illinois at Urbana-Champaign

Pattern Discovery in Data Mining (Coursera)

Learn the general concepts of data mining along with basic methodologies and applications. Then dive into one subfield in data mining: pattern discovery. Learn in-depth concepts, methods, and applications of pattern discovery in data mining. We will also introduce methods for data-driven phrase mining and some interesting applications of pattern discovery. This course provides you the opportunity to learn skills and content to practice and engage in scalable pattern discovery methods on massive transactional data, discuss pattern evaluation measures, and study methods for mining diverse kinds of patterns, sequential patterns, and sub-graph patterns.

Jul 6th 2026
4 Weeks
Practical Machine Learning (Coursera) Coursera
Johns Hopkins University

Practical Machine Learning (Coursera)

One of the most common tasks performed by data scientists and data analysts are prediction and machine learning. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and tests sets, overfitting, and error rates.

Jul 6th 2026
4 Weeks
Cloud Computing Concepts, Part 1 (Coursera) Coursera
University of Illinois at Urbana-Champaign

Cloud Computing Concepts, Part 1 (Coursera)

Cloud computing systems today, whether open-source or used inside companies, are built using a common set of core techniques, algorithms, and design philosophies—all centered around distributed systems. Learn about such fundamental distributed computing "concepts" for cloud computing. Some of these concepts include: clouds, MapReduce, key-value/NoSQL stores, classical distributed algorithms, widely-used distributed algorithms, scalability, trending areas, and much, much more!

Jul 6th 2026
5-12 Weeks
Machine Learning: Classification (Coursera) Coursera
University of Washington

Machine Learning: Classification (Coursera)

Case Studies: Analyzing Sentiment & Loan Default Prediction. In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information,...). In our second case study for this course, loan default prediction, you will tackle financial data, and predict when a loan is likely to be risky or safe for the bank.

Jul 6th 2026
5-12 Weeks
The Raspberry Pi Platform and Python Programming for the Raspberry Pi (Coursera) Coursera
University of California, Irvine

The Raspberry Pi Platform and Python Programming for the Raspberry Pi (Coursera)

The Raspberry Pi is a small, affordable single-board computer that you will use to design and develop fun and practical IoT devices while learning programming and computer hardware. In addition, you will learn how to set up up the Raspberry Pi environment, get a Linux operating system running, and write and execute some basic Python code on the Raspberry Pi. You will also learn how to use Python-based IDE (integrated development environments) for the Raspberry Pi and how to trace and debug Python code on the device.

Jul 6th 2026
4 Weeks
Approximation Algorithms Part I (Coursera) Coursera
École normale supérieure

Approximation Algorithms Part I (Coursera)

How efficiently can you pack objects into a minimum number of boxes? How well can you cluster nodes so as to cheaply separate a network into components around a few centers? These are examples of NP-hard combinatorial optimization problems. It is most likely impossible to solve such problems efficiently, so our aim is to give an approximate solution that can be computed in polynomial time and that at the same time has provable guarantees on its cost relative to the optimum.

Jul 6th 2026
5-12 Weeks
Advanced Algorithms and Complexity (Coursera) Coursera
University of California, San Diego,Higher School of Economics - HSE University

Advanced Algorithms and Complexity (Coursera)

You've learned the basic algorithms now and are ready to step into the area of more complex problems and algorithms to solve them. Advanced algorithms build upon basic ones and use new ideas. We will start with networks flows which are used in more typical applications such as optimal matchings, finding disjoint paths and flight scheduling as well as more surprising ones like image segmentation in computer vision.

Jul 6th 2026
5-12 Weeks
Regression Models (Coursera) Coursera
Johns Hopkins University

Regression Models (Coursera)

Linear models, as their name implies, relates an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models.

Jul 6th 2026
4 Weeks
Object Oriented Programming in Java (Coursera) Coursera
University of California, San Diego

Object Oriented Programming in Java (Coursera)

Welcome to our course on Object Oriented Programming in Java using data visualization. People come to this course with many different goals -- and we are really excited to work with all of you! Some of you want to be professional software developers, others want to improve your programming skills to implement that cool personal project that you’ve been thinking about, while others of you might not yet know why you’re here and are trying to figure out what this course is all about.

Jul 6th 2026
5-12 Weeks
Text Retrieval and Search Engines (Coursera) Coursera
University of Illinois at Urbana-Champaign

Text Retrieval and Search Engines (Coursera)

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

Jul 6th 2026
5-12 Weeks