Data Engineering Basics for Everyone (edX)

Offered by IBM

Learn about data engineering concepts, ecosystem, and lifecycle. Also learn about the systems, processes, and tools you need as a Data Engineer to gather, transform, load, process, query, and manage data so that data consumers can use it for operations and decision-making.


Welcome to Data Engineering Basics. This course is designed to familiarize you with data engineering concepts, ecosystem, lifecycle, processes, and tools.
The Data Engineering Ecosystem includes several different components. It includes data, data repositories, data integration platforms, data pipelines, different types of languages, and BI and Reporting tools. Data pipelines gather raw data from disparate data sources. Data repositories, such as relational and non-relational databases, data warehouses, data marts, data lakes, and big data stores, store and process this data. Data Integration Platforms combine data into a unified view for secure and easy access by data consumers. Data consumers use BI, reporting, and analytical tools on data so they can glean insights for better decision-making. You will learn about each of these components in this course.
A typical Data Engineering lifecycle includes architecting data platforms and designing data stores. It also includes the process of gathering, importing, wrangling, cleaning, querying, and analyzing data. Systems and workflows need to be monitored and fine-tuned so that they perform at optimal levels. In this course, you will learn about the architecture of data platforms and what you need to consider in order to design and select the right data store for your needs. You will also learn about the processes and tools a data engineer employs to gather, import, wrangle, clean, query, and analyze data.
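As an illustration only, the gather, clean, load, and query steps described above can be sketched in a few lines of Python. This is a minimal sketch, not course material: the sample CSV data, the `orders` table, and all function names are hypothetical, and an in-memory SQLite database stands in for a real data store.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a file exported from a source system.
RAW_CSV = """order_id,region,amount
1, north ,120.50
2,south,
3,north,79.50
4, South,200.00
"""


def extract(text):
    """Gather/import: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Wrangle/clean: trim whitespace, normalize case, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        region = row["region"].strip().lower()
        cleaned.append((int(row["order_id"]), region, float(amount)))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a (hypothetical) orders table."""
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def total_by_region(conn):
    """Query/analyze: total order amount per region, sorted by region name."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


def run_pipeline(text=RAW_CSV):
    """Run the whole sketch against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        load(transform(extract(text)), conn)
        return total_by_region(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_pipeline())
```

Real pipelines typically read from files, APIs, or databases and write to a managed data store; the structure of separate extract, transform, load, and query steps is the part this sketch is meant to show.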
Through a series of hands-on labs, you will be guided to provision a data store on IBM Cloud, prepare and load data into the data store, and perform some basic operations on data.
Data Engineering is recognized as one of the fastest-growing fields today. The course discusses the career opportunities available and the different paths you can take to become a data engineer. Seasoned data professionals advise you on the practical, day-to-day aspects of being a data engineer and on the skills and qualities employers look for in a data engineer.

What you'll learn
The objective of this course is to give you a solid understanding of what Data Engineering is.
In this course you will learn about:

Module 1: What is Data Engineering
Modern Data Ecosystem
Key Players in the Data Ecosystem
What is Data Engineering?
Responsibilities and Skillsets of a Data Engineer
A day in the life of a Data Engineer

Module 2: Data Engineering Ecosystem
Overview of the Data Engineering Ecosystem
Types of Data
Understanding different types of File Formats
Sources of Data
Languages for Data Professionals
Overview of Data Repositories
RDBMS
NoSQL
Data Warehouses, Data Marts, and Data Lakes
ETL, ELT, and Data Pipelines
Data Integration Platforms
Foundations of Big Data
Big Data processing tools: Hadoop, HDFS, Hive, and Spark

Module 3: Data Engineering Lifecycle
Architecting the Data Platform
Factors for Selecting and Designing Data Stores
Security
How to Gather and Import Data
Data Wrangling
Tools for Data Wrangling
Querying and Analyzing data
Performance Tuning and Troubleshooting
Governance and Compliance

Module 4: Career Opportunities and Learning Paths
Career Opportunities in Data Engineering
Data Engineering Learning Path



Related Courses

Data Analytics Basics for Everyone (edX)
IBM

Learn the fundamentals of Data Analytics and gain an understanding of the data ecosystem, the process and lifecycle of data analytics, career opportunities, and the different learning paths you can take to be a Data Analyst. In this course, you will learn about the various components of a modern data ecosystem and the role Data Analysts, Data Scientists, and Data Engineers play in this ecosystem.

Self-Paced

Python Project for Data Engineering (Coursera)
IBM

This mini-course is intended to apply foundational Python skills by implementing different techniques to collect and work with data. Assume the role of a Data Engineer and extract data from multiple file formats, transform it into specific datatypes, and then load it into a single source for analysis. Continue with the course and test your knowledge by implementing web scraping and extracting data with APIs, all with the help of multiple hands-on labs. After completing this course, you will have acquired the confidence to begin collecting large datasets from multiple sources and transforming them into one primary source, or to begin web scraping to gain valuable business insights, all with the use of Python.

Jun 8th 2026
1 Week

Advanced Data Engineering (Coursera)
Duke University

In this advanced course, you will gain practical expertise in scaling data engineering systems using cutting-edge tools and techniques. This course is designed for data scientists, data engineers, and anyone with a foundational understanding of data handling who wants to advance their skills to handle larger, more complex datasets efficiently.

Jun 15th 2026
4 Weeks

AI Skills for Engineers: Data Engineering and Data Pipelines (edX)
Delft University of Technology, DelftX

Good data is central to effective AI applications. This course teaches the basics of data for AI, covering what data is needed, how to extract data from existing databases, and basic data skills, including setup of a Python notebook environment, basic data exploration, and simple data visualizations.

Self-Paced

Python for Data Engineering Project (edX)
IBM

An opportunity to apply your foundational Python skills via a project, using various techniques to collect and work with data. Journey into the realm of becoming a Data Engineer and apply your basic Python knowledge of working with data. You will exercise various techniques in Python to extract data in multiple file formats from different sources, transform it into specific datatypes, and then prepare it for loading into a database.

Self-Paced

Introduction to SQL (edX)
IBM

Learn how to use and apply the powerful language of SQL to better communicate with and extract data from databases, a must for anyone working in Data Engineering, Data Analytics, or Data Science. Much of the world's data lives in databases. SQL (or Structured Query Language) is a powerful programming language that is used for communicating with and manipulating data in databases.

Self-Paced

Relational Database Basics (edX)
IBM

This course is an introduction to the world of relational databases. You will explore the fundamental concepts of relational databases and Relational Database Management Systems (RDBMS), learn about relational database design, and understand how to transform source data into tables with clearly defined relationships.

Self-Paced