Sep 5th 2016

Reliable Distributed Algorithms, Part 1 (edX)


This course gives a comprehensive introduction to the theory and practice of distributed algorithms for designing scalable, reliable services. It is the first in a series of two courses that together provide a solid foundation in reliable distributed computing, including the main concepts, results, models and algorithms in the field.

Today's global IT infrastructures are distributed systems, from the Internet to the data centers of cloud computing that fuel the current revolution in global IT services. At the core of these services are distributed algorithms.

These algorithms run on multiple computers and communicate only by sending and receiving messages. It is crucial for the implemented services to continue to work 24/7 even if some of the computers fail or some of the messages are lost in transit. This is the subject of reliable distributed algorithms in computer science.
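As a rough illustration of tolerating lost messages, the sketch below retransmits a message until an acknowledgement arrives, while the receiver discards duplicates. The names `LossyLink`, `Receiver` and `send_reliably` are hypothetical and are not taken from the course material; the link is a toy model that drops each message with a fixed probability.

```python
import random


class LossyLink:
    """Hypothetical channel that drops each message with probability `loss`."""

    def __init__(self, loss, seed=0):
        self.loss = loss
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def send(self, message, on_receive):
        # Hand the message to the receiver only if it survives the link.
        if self.rng.random() >= self.loss:
            on_receive(message)


class Receiver:
    def __init__(self):
        self.seen = set()      # ids already delivered, used to drop duplicates
        self.delivered = []    # payloads delivered to the application

    def on_message(self, message):
        msg_id, payload = message
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.delivered.append(payload)


def send_reliably(link, receiver, msg_id, payload, max_attempts=1000):
    """Retransmit until an acknowledgement arrives; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        acks = []

        def receive_and_ack(message):
            receiver.on_message(message)
            # The acknowledgement crosses the same lossy link and may be lost too.
            link.send(message[0], acks.append)

        link.send((msg_id, payload), receive_and_ack)
        if acks:
            return attempt
    raise RuntimeError("no acknowledgement after %d attempts" % max_attempts)


link = LossyLink(loss=0.5, seed=1)
receiver = Receiver()
for i, payload in enumerate(["a", "b", "c"]):
    send_reliably(link, receiver, i, payload)
print(receiver.delivered)
```

Even though the link drops about half of all messages and acknowledgements, each payload reaches the application exactly once, because retransmission masks the losses and the `seen` set filters out the resulting duplicates.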

ID2203.1x covers models of distributed algorithms based on input/output automata; specifications of fault-tolerant abstractions and failure detectors; specific distributed abstractions and fault-tolerant algorithms, including reliable broadcast and causal broadcast; key-value stores and consistency models; single-value consensus and the Paxos algorithm.
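To give a flavor of one of these abstractions, here is a minimal single-threaded simulation of eager reliable broadcast: every process relays a message the first time it delivers it, so all correct processes deliver it even if the sender crashes partway through sending. This is a sketch under stated assumptions, not the course's own implementation: `Process`, `eager_reliable_broadcast` and the `reached_before_crash` parameter are hypothetical names, and links between correct processes are assumed never to lose messages.

```python
from collections import deque


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.seen = set()      # message ids already delivered
        self.delivered = []    # payloads in delivery order


def eager_reliable_broadcast(processes, sender, msg_id, payload,
                             reached_before_crash=None):
    """Simulate one broadcast from `sender`.

    If `reached_before_crash` is given, the sender crashes after its copies
    have reached only those pids (an assumption made for this simulation).
    """
    by_pid = {p.pid: p for p in processes}
    if reached_before_crash is None:
        crashed = set()
        inbox = deque(p.pid for p in processes)  # sender reaches everyone
    else:
        crashed = {sender}
        inbox = deque(reached_before_crash)      # partial send, then crash

    while inbox:
        pid = inbox.popleft()
        if pid in crashed:
            continue                  # crashed processes take no steps
        proc = by_pid[pid]
        if msg_id in proc.seen:
            continue                  # duplicate copy, ignore
        proc.seen.add(msg_id)
        proc.delivered.append(payload)
        # Eager step: relay to every other process on first delivery.
        inbox.extend(p.pid for p in processes if p.pid != pid)


procs = [Process(i) for i in range(4)]
# Process 0 crashes after its message has reached only process 2.
eager_reliable_broadcast(procs, sender=0, msg_id="m1", payload="hello",
                         reached_before_crash=[2])
print([p.delivered for p in procs])
```

In this run, process 2 relays the message to processes 1 and 3, so every process that did not crash delivers it exactly once. The price of this eager relaying is a quadratic number of messages per broadcast.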

To complete the course with a full grade (100%), students must complete the graded quizzes provided every week as well as the programming assignments.

What you'll learn:

- Event-driven concurrent programming of distributed algorithms

- Formal models of asynchronous systems using input/output automata

- Failure detectors and equivalence between various distributed abstractions

- Specifications and algorithms for reliable and causal-order broadcast

- Distributed shared memory and consistency models

- Single-value consensus and related consensus algorithms, including Paxos