MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Site Reliability Engineers must have the right tools and strategies to perform in a fast-paced technical environment. Nine competency areas guide the successful practice of IBM Cloud SREs:
- Applying Site Reliability Engineering principles
- Operations
- Monitoring and incident management
- Security and compliance
- Compute infrastructure
- Networking
- Storage and data management
- Reliability and resiliency
- Deployment automation
In this second course of the three-part Professional Certificate in Site Reliability Engineering (SRE), you will focus on the following five SRE competencies:
- Compute infrastructure
- Networking
- Storage and data management
- Reliability and resiliency
- Deployment automation
NOTE: The remaining four SRE competencies are covered in Course 1: SRE Fundamentals and Security.
This course is part of the Site Reliability Engineering (SRE) Professional Certificate.
What you'll learn
Compute infrastructure
- Troubleshoot VMs, IBM Kubernetes Service (IKS), Red Hat OpenShift and serverless services on IBM Cloud
- Configure for high availability and scalability
- Explain the impact of compute on service performance
Networking
- Troubleshoot external connections to IBM Cloud
- Troubleshoot interservice connectivity on IBM Cloud
- Explain the reliability ramifications of IBM Cloud networking features
- Explain the impact of networking on service performance
Storage and data management
- Manage storage and data attributes
- Manage data replication and retention
- Explain the impact of storage on service performance
- Monitor data security and compliance
- Identify storage data durability and capacity management
Reliability and resiliency
- Design and improve reliability for the system/service
- Design for failure and recover from failure
Deployment automation
- Design non-disruptive deployment
- Troubleshoot provisioning of IBM Cloud resources
- Implement Infrastructure as Code
- Explain the SRE's responsibilities for CI/CD pipelines
- Troubleshoot CI/CD pipelines
Syllabus
Module 1: Compute infrastructure
You will cover the following topics:
- IBM Cloud service models: IaaS, PaaS, and FaaS
- Troubleshooting VMs on IBM Cloud
- Troubleshooting clusters on IBM Kubernetes Service
- Troubleshooting clusters on Red Hat OpenShift on IBM Cloud
- Troubleshooting serverless services
Module 2: Networking
You will cover the following topics:
- Applying IBM Cloud networking features
- Implementing and managing virtual networks on IBM Cloud
- Configuring name resolution on IBM Cloud
- Managing performance on IBM Cloud
- Troubleshooting external connections on IBM Cloud
- Troubleshooting interservice connectivity on IBM Cloud
Module 3: Storage and data management
You will cover the following topics:
- Managing storage and data attributes
- Managing storage accounts
- Managing data on IBM Cloud
- Managing data replication and retention
Module 4: Reliability and resiliency
You will cover the following topics:
- Importance of reliability and resiliency for services
- Designing and improving reliability for systems and services
- Designing for failure and recovering from failure
Module 5: Deployment automation
You will cover the following topics:
- Deployment automation
- Implementing Infrastructure as Code
- SRE responsibilities for CI/CD pipelines
Prerequisites:
At least 1 year of experience in SRE or technology.
Understanding of:
- DevOps practices
- Software engineering principles
- System administration
- Networking and the OSI model
- Incident management
- Root cause analysis
Recommended courses:
- Introduction to Cloud Computing
- IBM Cloud Essentials