AI Workflow: Feature Engineering and Bias Detection (Coursera)

Offered by IBM

This is the third course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order: they are not independent courses but parts of a workflow in which each course builds on the previous ones. Course 3 introduces you to the next stage of the workflow for our hypothetical media company. In this stage you will learn best practices for feature engineering, handling class imbalances, and detecting bias in the data.


Class imbalances can seriously affect the validity of your machine learning models, and the mitigation of bias in data is essential to reducing the risk associated with biased models. These topics will be followed by sections on best practices for dimension reduction, outlier detection, and unsupervised learning techniques for finding patterns in your data. The case studies will focus on topic modeling and data visualization.
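
To make the class-imbalance point concrete, here is a minimal sketch in plain Python. The data and the helper names (`accuracy`, `balanced_accuracy`, `balanced_class_weights`) are illustrative assumptions, not course material. It shows how a model that always predicts the majority class can report high accuracy while never finding the minority class, and how per-class weights can be derived from class counts.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def balanced_class_weights(y_true):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(y_true)
    n_samples, n_classes = len(y_true), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}

# Hypothetical data: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# A "model" that always predicts the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))           # 0.95
print(balanced_accuracy(y_true, y_pred))  # 0.5
print(balanced_class_weights(y_true)[1])  # 10.0
```

The gap between 0.95 and 0.5 is the reason plain accuracy is a poor guide on imbalanced data; the weights give the rare class proportionally more influence if passed to a learner that accepts per-class weights.
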
By the end of this course you will be able to:

  1. Employ tools that help address class imbalance issues
  2. Explain the ethical considerations regarding bias in data
  3. Employ the AI Fairness 360 open-source libraries to detect bias in models
  4. Employ dimension reduction techniques for both the EDA and transformation stages
  5. Describe topic modeling techniques in natural language processing
  6. Use topic modeling and visualization to explore text data
  7. Employ outlier handling best practices in high-dimensional data
  8. Employ outlier detection algorithms as a quality assurance tool and a modeling tool
  9. Employ unsupervised learning techniques using pipelines as part of the AI workflow
  10. Employ basic clustering algorithms
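
As an illustration of the bias-detection objective, the sketch below computes two common group-fairness measures, disparate impact and statistical parity difference, in plain Python. It deliberately does not call AI Fairness 360; the function names, group labels, and data are hypothetical and chosen only for this example.

```python
def favorable_rate(outcomes, groups, group):
    """Share of favorable outcomes (coded as 1) among members of `group`."""
    selected = [o for o, g in zip(outcomes, groups) if g == group]
    return sum(selected) / len(selected)

def disparate_impact(outcomes, groups, unprivileged, privileged):
    """Ratio of favorable-outcome rates; 1.0 indicates parity."""
    return (favorable_rate(outcomes, groups, unprivileged)
            / favorable_rate(outcomes, groups, privileged))

def statistical_parity_difference(outcomes, groups, unprivileged, privileged):
    """Difference of favorable-outcome rates; 0.0 indicates parity."""
    return (favorable_rate(outcomes, groups, unprivileged)
            - favorable_rate(outcomes, groups, privileged))

# Hypothetical data: group "a" receives 8 of 10 favorable outcomes,
# group "b" receives 4 of 10.
groups = ["a"] * 10 + ["b"] * 10
outcomes = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6

di = disparate_impact(outcomes, groups, unprivileged="b", privileged="a")
spd = statistical_parity_difference(outcomes, groups, unprivileged="b", privileged="a")
print(di)   # 0.5
print(spd)  # -0.4
```

A disparate impact well below 1.0, as here, suggests that the unprivileged group receives favorable outcomes at a noticeably lower rate and that the data or model deserves a closer look.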

Course 3 of 6 in the IBM AI Enterprise Workflow Specialization.

Syllabus

WEEK 1
Data transforms and feature engineering
This module will introduce you to skills required for effective feature engineering in today's business enterprises. The skills are presented as a series of best practices representing years of practical experience.

WEEK 2
Pattern recognition and data mining best practices
This module will continue the discussion of skills related to feature engineering for practicing data scientists, with a focus on outliers and the use of unsupervised learning techniques for finding patterns.
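
For a sense of what a simple outlier check can look like, here is a minimal sketch that flags values by their modified z-score, which is based on the median and the median absolute deviation (MAD). The function name `mad_outliers`, the sample data, and the conventional constants (0.6745 and a cutoff of 3.5) are assumptions for illustration; the module itself may use different techniques.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds `threshold`."""
    med = statistics.median(values)
    # Median absolute deviation: a spread estimate that resists outliers.
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median, so the score is undefined.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical measurements with one obviously extreme value.
data = [10, 12, 11, 13, 12, 11, 95]
print(mad_outliers(data))  # [95]
```

Because the median and MAD are barely moved by the extreme value, the check stays stable where a mean and standard deviation would be pulled toward the outlier.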

