SRE Fundamentals and Security (edX)

MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Learn the foundational principles and terminology needed to understand the new and growing discipline of Site Reliability Engineering. Explore operations strategies and best practices for monitoring and managing service health and security.

Site Reliability Engineers must have the right tools and strategies to perform in a technical, fast-paced environment. IBM Cloud SRE is guided by nine competency areas that lead to the successful practice of the discipline:

- Applying Site Reliability Engineering principles

- Operations

- Monitoring and incident management

- Security and compliance

- Compute infrastructure

- Networking

- Storage and data management

- Reliability and resiliency

- Deployment automation

In this first course of the three-part Professional Certificate in Site Reliability Engineering (SRE), you will focus on the first four SRE competencies:

- Applying Site Reliability Engineering principles

- Operations

- Monitoring and incident management

- Security and compliance

This course is part of the Site Reliability Engineering (SRE) Professional Certificate.


Prerequisites:

At least one year of experience in SRE or technology.

Understanding of:

- DevOps practices

- Software engineering principles

- System administration

- Networking and the OSI model

- Incident management

- Root cause analysis

Recommended courses:

- Introduction to Cloud Computing

- IBM Cloud Essentials


What you'll learn

Applying Site Reliability Engineering principles

- Manage the trade-off between change, velocity, and reliability of services

- Negotiate service level objectives, service level indicators, and error budgets

- Design and deploy automation strategies

- Leverage IBM Cloud tools and technology across the software development life cycle

- Understand the roles and responsibilities for SRE effectiveness

Operations

- Monitor resource utilization

- Perform operational readiness reviews (ORR)

- Employ cost-optimization strategies

- Identify key metrics for service health

Monitoring and incident management

- Create and maintain metrics, traces, and alerts

- Collect, analyze, and manage logs on IBM Cloud

- Manage incidents

- Perform post-incident reviews

- Recognize and differentiate performance and availability metrics

- Perform statistical analysis and create actionable outcomes

Security and compliance

- Monitor security threats

- Implement and manage security policies

- Implement encryption models

- Manage role-based access control (RBAC) on IBM Cloud

- Define the shared responsibility model


Syllabus


Module 1: Welcome and Introduction

You will cover the following topics:

- An introduction to the IBM Professional SRE role

Module 2: SRE Fundamentals and Terminology

You will cover the following topics:

- A deeper dive into the SRE role

- SRE principles

- Managing trade-offs between change, velocity, and reliability

- Negotiating service level objectives, service level indicators, error budgets, and the user experience

- IBM Cloud tools and technology across the Software Development Life Cycle

- Applying software engineering principles to drive reliability

Module 3: Operations

You will cover the following topics:

- Performing operational readiness reviews (ORR) on IBM Cloud

- Creating an ORR checklist

- Employing cost-optimization strategies

- Managing backups and recoveries on IBM Cloud

Module 4: Monitoring

You will cover the following topics:

- Monitoring overview

- Creating and maintaining metrics, traces, and alerts on IBM Cloud

- Collecting, analyzing, and managing logs on IBM Cloud

- Identifying key metrics for service health on IBM Cloud

- Using performance and availability metrics to measure the health of services on IBM Cloud

Module 5: Incident Management

You will cover the following topics:

- Managing incidents on IBM Cloud

- Developing a balanced action plan to mitigate future incidents

- Performing the post-incident review

Module 6: Security and Compliance

You will cover the following topics:

- Monitoring and managing security threats on IBM Cloud

- Implementing and managing security policies on IBM Cloud

- Implementing encryption models

- Managing role-based access control on IBM Cloud



Course Auditing
90.00 EUR
