Data for Machine Learning (Coursera)

This course is all about data and how it is critical to the success of your applied machine learning model.

Completing this course will give learners the skills to:

  • Understand the critical elements of data in the learning, training and operation phases
  • Understand biases and sources of data
  • Implement techniques to improve the generality of your model
  • Explain the consequences of overfitting and identify mitigation measures
  • Implement appropriate test and validation measures
  • Demonstrate how the accuracy of your model can be improved with thoughtful feature engineering
  • Explore the impact of the algorithm parameters on model strength

To be successful in this course, you should have at least a beginner-level background in Python programming (e.g., be able to read and trace existing code, and be comfortable with conditionals, loops, variables, lists, dictionaries and arrays). You should also have a basic understanding of linear algebra (vector notation) and statistics (probability distributions and mean/median/mode).
Course 3 of 4 in the Machine Learning: Algorithms in the Real World Specialization.

Syllabus

WEEK 1
What Does Good Data Look Like?
We all know that data is important for machine learning success, but what does it really look like? What steps do you need to take to get from scattered, unprocessed data to nice clean learning data? This week takes an overarching view to describe how your problem and data needs interact, and what processes need to be in place for successful data preparation.

WEEK 2
Preparing your Data for Machine Learning Success
Now that you have identified your data sources, you need to bring them all together. This week describes what you need to do to prepare your data as a whole.

WEEK 3
Feature Engineering for MORE Fun & Profit
Data is particular to a problem. This week we'll discuss how to turn generic data into successful fuel for specific machine learning projects.

WEEK 4
Bad Data
There are so many ways data can go wrong! This week discusses some of the pitfalls in data identification and processing.

Related Courses

Hadoop Platform and Application Framework (Coursera)
University of California, San Diego

This course is for novice programmers or business people who'd like to understand the core tools used to wrangle and analyze big data. With no prior experience, you'll have the opportunity to walk through hands-on examples with Hadoop and Spark frameworks, two of the most common in the industry. You will be comfortable explaining the specific components and basic processes of the Hadoop architecture, software stack, and execution environment.

Jun 1st 2026
5-12 Weeks
Data Science in Real Life (Coursera)
Johns Hopkins University

Have you ever had the perfect data science experience? The data pull went perfectly. There were no merging errors or missing data. Hypotheses were clearly defined prior to analyses. Randomization was performed for the treatment of interest. The analytic plan was outlined prior to analysis and followed exactly. The conclusions were clear and actionable decisions were obvious. Has that ever happened to you? Of course not. Data analysis in real life is messy. How does one manage a team facing real data analyses? In this one-week course, we contrast the ideal with what happens in real life. Through this contrast, you will learn key concepts that will help you manage real-life analyses.

Jun 1st 2026
1 Week
Nearest Neighbor Collaborative Filtering (Coursera)
University of Minnesota

In this course, you will learn the fundamental techniques for making personalized recommendations through nearest-neighbor techniques. First you will learn user-user collaborative filtering, an algorithm that identifies other people with similar tastes to a target user and combines their ratings to make recommendations for that user.

Jun 1st 2026
4 Weeks
Deep Learning for Business (Coursera)
Yonsei University

Your smartphone, smartwatch, and automobile (if it is a newer model) have AI (Artificial Intelligence) inside serving you every day. In the near future, more advanced “self-learning” capable DL (Deep Learning) and ML (Machine Learning) technology will be used in almost every aspect of your business and industry. So now is the right time to learn what DL and ML are and how to use them to your company's advantage. This course has three parts. The first part focuses on future business strategy based on DL and ML technology, including details on new state-of-the-art products/services and open source DL software, which are the future enablers.

Jun 1st 2026
5-12 Weeks
Matrix Factorization and Advanced Techniques (Coursera)
University of Minnesota

In this course you will learn a variety of matrix factorization and hybrid machine learning techniques for recommender systems. Starting with basic matrix factorization, you will understand both the intuition and the practical details of building recommender systems based on reducing the dimensionality of the user-product preference space. Then you will learn about techniques that combine the strengths of different algorithms into powerful hybrid recommenders.

Jun 1st 2026
5-12 Weeks
Regression Modeling in Practice (Coursera)
Wesleyan University

This course focuses on one of the most important tools in your data analysis arsenal: regression analysis. Using either SAS or Python, you will begin with linear regression and then learn how to adapt when two variables do not present a clear linear relationship. You will examine multiple predictors of your outcome and be able to identify confounding variables, which can tell a more compelling story about your results. You will learn the assumptions underlying regression analysis, how to interpret regression coefficients, and how to use regression diagnostic plots and other tools to evaluate the quality of your regression model. Throughout the course, you will share with others the regression models you have developed and the stories they tell you.

Jun 5th 2026
4 Weeks
Python Data Analysis (Coursera)
Rice University

This course will continue the introduction to Python programming that started with Python Programming Essentials and Python Data Representations. We'll learn about reading, storing, and processing tabular data, which are common tasks. We will also teach you about CSV files and Python's support for reading and writing them. CSV files are a generic, plain text file format that allows you to exchange tabular data between different programs. These concepts and skills will help you to further extend your Python programming knowledge and allow you to process more complex data.

Jun 1st 2026
4 Weeks
Advanced Algorithms and Complexity (Coursera)
University of California, San Diego; Higher School of Economics - HSE University

You've learned the basic algorithms now and are ready to step into the area of more complex problems and the algorithms that solve them. Advanced algorithms build upon basic ones and use new ideas. We will start with network flows, which are used in more typical applications such as optimal matchings, finding disjoint paths, and flight scheduling, as well as more surprising ones like image segmentation in computer vision.

Jun 1st 2026
5-12 Weeks
Python Programming Essentials (Coursera)
Rice University

This course will introduce you to the wonderful world of Python programming! We'll learn about the essential elements of programming and how to construct basic Python programs. We will cover expressions, variables, functions, logic, and conditionals, which are foundational concepts in computer programming. We will also teach you how to use Python modules, which enable you to benefit from the vast array of functionality that is already a part of the Python language. These concepts and skills will help you to begin to think like a computer programmer and to understand how to go about writing Python programs.

Jun 1st 2026
4 Weeks
Learn to code with AI (Coursera)
Scrimba

Imagine waking up tomorrow as a web developer. What would you want to build? With AI tools like ChatGPT, you're already a developer, regardless of your experience, if you know how to work with them. So in this course, you'll build functional, interactive front-end projects while learning how to write effective prompts and debug and refine your code with the help of AI.

Jun 3rd 2026
2 Weeks
Introduction To Swift Programming (Coursera)
University of Toronto

Introduction to Swift Programming is the first course in a four-part specialization series that will provide you with the tools and skills necessary to develop an iOS app from scratch. By the end of this first course you will be able to demonstrate intermediate application of programming in Swift, the powerful new programming language for iOS. Guided by best practices, you will become proficient with syntax, object-oriented principles, memory management, functional concepts and more in programming with Swift.

Jun 1st 2026
5-12 Weeks
Python Project for Data Science (Coursera)
IBM

This mini-course is intended for you to demonstrate foundational Python skills for working with data. Completing this course involves working on a hands-on project in which you will develop a simple dashboard using Python. This course is part of the IBM Data Science Professional Certificate and the IBM Data Analytics Professional Certificate.

Jun 4th 2026
1 Week