Site Reliability Engineering (SRE) Professional Certificate

What you will learn:
- The core principles of Site Reliability Engineering.
- How to design automation strategies, perform operational readiness reviews, employ cost-optimization strategies, and manage backups and recoveries.
- Approaches to cloud monitoring, including identifying key metrics and measuring service health.
- How to identify and manage incidents, develop action plans to mitigate future risk, and perform post-incident reviews.
- Key concepts for monitoring and managing security threats.
- How to troubleshoot common IBM Cloud issues.
- How to design and improve reliability for systems and cloud services and employ best practices to automate deployments.

SRE Capstone (edX)

Self Paced
The SRE Capstone offers interactive study guides and flash cards that will help you prepare for the Professional SRE - Cloud V2 certification exam. Also included are hands-on lab exercises that allow you to put the knowledge you gained from the SRE Fundamentals and Security and SRE Infrastructure, Resiliency [...]

SRE Infrastructure, Resiliency and Deployment Automation (edX)

Self Paced
Discover the importance of reliability engineering and resiliency for services and how the deployment pipeline can be used to help with automation. Explore various infrastructure types and troubleshoot common service issues, including those affecting Kubernetes and OpenShift clusters.

SRE Fundamentals and Security (edX)

Learn the foundational principles and terminology needed to understand the new and growing discipline of Site Reliability Engineering. Explore operational strategies and best practices for monitoring and managing service health and security.