Advanced Data Engineering (Coursera)

Offered by Duke University

In this advanced course, you will gain practical expertise in scaling data engineering systems using current tools and techniques. This course is designed for data scientists, data engineers, and anyone with a foundational understanding of data handling who wants to extend their skills to handle larger, more complex datasets efficiently.

Throughout the course, you'll master the application of technologies such as Celery with RabbitMQ for scalable data consumption, Apache Airflow for optimized workflow management, and Vector and Graph databases for robust data management at scale.
The course culminates in hands-on projects that offer real-world experience, where you'll put your acquired skills to the test by solving data engineering challenges. You will learn not only to create scalable data systems but also to analyze their performance and make the adjustments needed for optimum results.
This invaluable experience in advanced data engineering techniques will prepare you for the demanding tasks of handling massive datasets, streamlining complex workflows, and optimizing data operations for businesses of any scale.
This course is part of the Large Language Model Operations (LLMOps) Specialization.

What you'll learn

  • Create and manage data pipelines and their lifecycle
  • Connect and work with message queues to manage data processing
  • Use vector, graph, and key/value databases for data storage at scale

Syllabus

Queues and Databases-RabbitMQ and MySQL
This week you will learn about databases and queues. You will find out the purpose and components of RabbitMQ, including its use of queues and its integration with Celery. Through hands-on exercises, you will gain experience connecting Celery to RabbitMQ within a Flask application and implementing task patterns like fire and forget and result retrieval. The week also covers core MySQL skills like interacting via the command line interface, manipulating databases, and integrating with Python web apps. By the end, you will have a foundational understanding of RabbitMQ, Celery, and MySQL that allows you to start building modern, asynchronous applications backed by a database.
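
The two task patterns named above can be sketched without a broker. The example below is only an illustration, not course material: it swaps Celery and RabbitMQ for Python's standard-library thread pool, and the names `word_count`, `record_event`, and `run_demo` are hypothetical. In Celery, the roughly corresponding calls would be `task.delay(...)` to enqueue work and `.get()` on the returned result object to retrieve a value.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """A task whose return value the caller wants back."""
    return len(text.split())


def run_demo():
    audit_log = []  # side effects of the fire-and-forget task land here

    def record_event(event):
        """A task run only for its side effect; nobody reads its result."""
        audit_log.append(event)

    with ThreadPoolExecutor(max_workers=2) as pool:
        # Fire and forget: submit the task and drop the returned handle.
        pool.submit(record_event, "user_signed_up")

        # Result retrieval: keep the handle and block until the value arrives.
        future = pool.submit(word_count, "scalable data consumption")
        count = future.result(timeout=5)

    # Leaving the `with` block waits for every submitted task to finish.
    return count, audit_log
```

Calling `run_demo()` returns the retrieved word count together with the log written by the fire-and-forget task. With a real message queue the caller and the worker would typically be separate processes, so the side effect would be written to shared storage rather than a local list.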

Optimizing Workflow Management at Scale with Apache Airflow

Achieving Scalability with Vector, Graph, and Key/Value Databases
This week we explore vector and graph databases, powerful tools for managing and extracting insights from large, complex datasets. As data volumes continue to grow, scalability is crucial. We'll learn how vector and graph databases can efficiently store data while maintaining relationships, enabling more advanced analytics. Through real-world examples, you'll see how these databases unlock scalability for machine learning, fraud detection, social networks, and more.
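
As a rough illustration of the two query styles, the sketch below runs a brute-force cosine-similarity lookup (the core operation behind vector search) and a breadth-first traversal over stored relationships (the core operation behind graph queries). It is an in-memory toy under stated assumptions, not how any particular database is implemented; the data in `EMBEDDINGS` and `FOLLOWS` and all function names are made up for the example.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=1):
    """Return the k keys whose vectors are most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]


def within_hops(graph, start, max_hops):
    """Return every node reachable from start in at most max_hops edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    seen.discard(start)
    return seen


# Hypothetical sample data: three tiny embeddings and a follower chain.
EMBEDDINGS = {
    "invoice": [0.9, 0.1, 0.0],
    "receipt": [0.8, 0.2, 0.1],
    "holiday": [0.0, 0.1, 0.9],
}
FOLLOWS = {"ana": ["ben"], "ben": ["cai"], "cai": ["dee"]}
```

Here `nearest([1.0, 0.0, 0.0], EMBEDDINGS, k=2)` ranks "invoice" and "receipt" ahead of "holiday", and `within_hops(FOLLOWS, "ana", 2)` finds "ben" and "cai" but not "dee". Production vector databases usually replace the full scan with an approximate index so that lookups stay fast as the collection grows.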

Real-world Advanced Data Engineering Projects
In this final week, you will work on advanced real-world data engineering projects, applying everything you've learned. You'll encounter complex data challenges and devise solutions using the latest tools and techniques. This is an opportunity to bring together data engineering concepts covered throughout the course and implement them holistically to deliver impactful outcomes.

Related Courses

Foundations for Big Data Analysis with SQL (Coursera)
Cloudera

In this course, you'll get a big-picture view of using SQL for big data, starting with an overview of data, database systems, and the common querying language (SQL). Then you'll learn the characteristics of big data and SQL tools for working on big data platforms. You'll also install an exercise environment (virtual machine) to be used through the specialization courses, and you'll have an opportunity to do some initial exploration of databases and tables in that environment.

Jun 22nd 2026
5-12 Weeks
Databases and SQL for Data Science with Python (Coursera)
IBM

Much of the world's data resides in databases. SQL (or Structured Query Language) is a powerful language which is used for communicating with and extracting data from databases. A working knowledge of databases and SQL is a must if you want to become a data scientist. The purpose of this course is to introduce relational database concepts and help you learn and apply foundational knowledge of the SQL language. It is also intended to get you started with performing SQL access in a data science environment.

Jun 22nd 2026
4 Weeks
Building Database Applications in PHP (Coursera)
University of Michigan

In this course, we'll look at the object-oriented patterns available in PHP. You'll learn how to connect to a MySQL database using the PHP Data Objects (PDO) library and issue SQL commands in the PHP language. We'll also look at how PHP uses cookies and manages session data. You'll learn how PHP avoids double posting data, how flash messages are implemented, and how to use a session to log in users in web applications.

Jun 22nd 2026
5-12 Weeks
Introdução à Ciência e Engenharia de Dados (Coursera)
FIA Business School

In this course, you will learn that data has become the main business asset today. With the rise of Big Data and the creation of new technologies, organizations around the world are innovating and discovering new ways to analyze the potential of the data at their disposal, which helps with growth, profitability, the direction of overall operations, and customer satisfaction. For all of this to work properly, and for that potential to be extracted accurately and in a way that is viable for the business, the field of data science was created.

Jun 22nd 2026
4 Weeks
Python Scripting: Files, Inheritance, and Databases (Coursera)
LearnQuest

This course is the third in a series that aims to prepare you for a role working as a programmer. In this course, you will be introduced to three main concepts in programming: files, inheritance, and external libraries. Labs will allow you to apply the material from the lectures in simple computer programs designed to reinforce the lesson.

Jun 22nd 2026
4 Weeks
Plant Bioinformatics Capstone (Coursera)
University of Toronto

The past 15 years have been exciting ones in plant biology. Hundreds of plant genomes have been sequenced, RNA-seq has enabled transcriptome-wide expression profiling, and a proliferation of "-seq"-based methods has permitted protein-protein and protein-DNA interactions to be determined cheaply and in a high-throughput manner. These data sets in turn allow us to generate hypotheses at the click of a mouse or tap of a finger. In Plant Bioinformatics on Coursera.org, we covered 33 plant-specific online tools from genome browsers to transcriptomic data mining to promoter/network analyses and others, and in this Plant Bioinformatics Capstone we'll use these tools to hypothesize a biological role for a gene of unknown function, summarized in a written lab report.

Jun 22nd 2026
5-12 Weeks
Healthcare Data Models (Coursera)
University of California, Davis

Career prospects are bright for those qualified to work in healthcare data analytics. Perhaps you work in data analytics but are considering a move into healthcare, where your work can improve people’s quality of life. If so, this course gives you a glimpse into why this work matters, what you’d be doing in this role, and what takes place on the Path to Value, where data is gathered from patients at the point of care, moves into data warehouses to be prepared for analysis, then moves along the data pipeline to be transformed into valuable insights that can save lives, reduce costs, improve healthcare, and make it more accessible and affordable.

Jun 22nd 2026
4 Weeks
Oracle SQL Proficiency (Coursera)
LearnQuest

This course is designed to help you continue learning about Oracle SQL and database management. We will look more closely at the Create, Alter, and Update commands, explore database relationships, and demonstrate how to use database views and SQL functions. It is recommended that you complete the first three courses of this specialization prior to this one.

Jun 22nd 2026
2 Weeks