Data Pipelines

Serverless Data Processing with Dataflow: Operations (edX)

Self Paced
Course Auditing
In the last installment of the Dataflow course series, we will introduce the components of the Dataflow operational model. We will examine tools and techniques for troubleshooting and optimizing pipeline [...]

Serverless Data Processing with Dataflow: Develop Pipelines (edX)

Self Paced
Course Auditing
In this second installment of the Dataflow course series, we dive deeper into developing pipelines using the Beam SDK. We start with [...]

Smart Analytics, Machine Learning, and AI on Google Cloud (edX)

Self Paced
Course Auditing
Incorporating machine learning into data pipelines increases the ability of businesses to extract insights from their data. This course covers several ways machine learning can be included in data pipelines on Google Cloud, depending on the level of customization required. [...]

Building Resilient Streaming Analytics Systems on Google Cloud (edX)

Self Paced
Course Auditing
This class is intended for data analysts, data scientists, and programmers who want to build for out-of-the-ordinary scenarios such as high availability, resiliency, high throughput, and real-time streaming analytics on Google Cloud.

Building Batch Data Pipelines on Google Cloud (edX)

Self Paced
Course Auditing
This course is intended for developers responsible for designing pipelines and architectures for data processing. Data pipelines typically fall under one of the Extract-Load, Extract-Load-Transform, or Extract-Transform-Load paradigms. This course describes which paradigm should be used, and when, for batch data.

AI Skills for Engineers: Data Engineering and Data Pipelines (edX)

Good data is central to effective AI applications. This course teaches the basics of data for AI, covering what data is needed, how to extract data from existing databases and basic data skills including setup of a Python notebook environment, basic data exploration and simple data visualizations.

Building ETL and Data Pipelines with Bash, Airflow and Kafka (edX)

Self Paced
Course Auditing
This course provides you with practical skills to build and manage data pipelines and Extract, Transform, Load (ETL) processes using shell scripts, Airflow and Kafka. Well-designed and automated data pipelines and ETL processes are the foundation of a successful Business Intelligence platform. Defining your data workflows, pipelines and processes [...]

Data Engineer (Dataquest)

Self Paced
Free Course
Get all the skills and knowledge you need to become a data engineer. You’ll learn how to work with data architecture, data processing, and data systems. By the end, you’ll be able to build a unique data infrastructure, manage data pipelines and data processing, and maintain data systems.

Machine Learning Operations 2 (MLOps2-AML): Data Pipeline Automation & Optimization using Microsoft Azure Machine Learning (AML) (edX)

Most data science projects fail. There are various reasons why, but one of the primary reasons is the challenge of deployment. One piece of the deployment puzzle is understanding how to automate your pipeline’s functions and continuously optimize its performance, which is why we developed this course, MLOps2: Data [...]

Machine Learning Operations 2 (MLOps2-GCP): Data Pipeline Automation & Optimization using Google Cloud Platform (GCP) (edX)

Most data science projects fail. There are various reasons why, but one of the primary reasons is the challenge of deployment. One piece of the deployment puzzle is understanding how to automate your pipeline’s functions and continuously optimize its performance, which is why we developed this course, MLOps2: Data [...]