EdX

String Processing and Pattern Matching Algorithms (edX)

String Processing and Pattern Matching Algorithms (edX)

Learn about pattern matching and string processing algorithms and how they apply to interesting applications. The world and internet are full of textual information. We search for information using textual queries and read websites, books and e-mails.

Class Deals by MOOC List - Click here and see EdX's Active Discounts, Deals, and Promo Codes.

These are all strings from a computer science point of view. To make sense of all this information and make search efficient, search engines use many string algorithms. Moreover, the emerging field of personalized medicine uses many search algorithms to find disease-causing mutations in the human genome.
In this course, part of the Algorithms and Data Structures MicroMasters program, you will learn about:

  • suffix trees;
  • suffix arrays;
  • how other brilliant algorithmic ideas help doctors to find differences between genomes;
  • power lightning-fast Internet searches.

What you'll learn

  • Key ideas for pattern matching and suffix trees
  • Suffix arrays

-Burrows-Wheeler Transform for compression

  • Applications of string algorithms in bioinformatics

Prerequisites
Basic knowledge of at least one programming language. Successful completion of the Algorithmic Toolbox and Data Structures courses.

Course Syllabus

Weeks 1 and 2: Suffix Trees
How would you search for a longest repeat in a string in LINEAR time? In 1973, Peter Weiner came up with a surprising solution that was based on suffix trees, the key data structure in pattern matching. Computer scientists were so impressed with his algorithm that they called it the Algorithm of the Year. In this lesson, we will explore some key ideas for pattern matching that will - through a series of trials and errors - bring us to suffix trees.

Week 3 and 4: Burrows-Wheeler Transform and Suffix Arrays
Although EXACT pattern matching with suffix trees is fast, it is not clear how to use suffix trees for APPROXIMATE pattern matching. In 1994, Michael Burrows and David Wheeler invented an ingenious algorithm for text compression that is now known as Burrows-Wheeler Transform. They knew nothing about genomics, and they could not have imagined that 15 years later their algorithm will become the workhorse of biologists searching for genomic mutations. But what text compression has to do with pattern matching??? In this lesson you will learn that the fate of an algorithm is often hard to predict – its applications may appear in a field that has nothing to do with the original plan of its inventors.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Introduction to Java Programming: Starting to code in Java (edX) EdX
Universidad Carlos III de Madrid - UC3M,UC3Mx

Introduction to Java Programming: Starting to code in Java (edX)

Learn to program with Java in an easy and interactive way! In this introductory Java programming course, you will be introduced to powerful concepts such as functional abstraction, the object oriented programming (OOP) paradigm and Application Programming Interfaces (APIs). Examples and case studies will be provided so that you can implement simple programs on your own or collaborate with peers.

Self Paced
Self-Paced
Laboratorio di Programmazione (edX) EdX
University of Naples Federico II,FedericaX

Laboratorio di Programmazione (edX)

Impara a risolvere problemi complessi attraverso l'uso del computer e avvicinati alla magia degli algoritmi. Il linguaggio di programmazione è uno degli strumenti che abbiamo per interpretare e risolvere i problemi di tutti i giorni. Un linguaggio che è alla base di problemi comuni, come le previsioni del tempo o l'analisi della deformazione di una struttura di un'auto in un incidente stradale.

Self Paced
Self-Paced
Distributed Machine Learning with Apache Spark (edX) EdX
University of California, Berkeley,BerkeleyX

Distributed Machine Learning with Apache Spark (edX)

Learn the underlying principles required to develop scalable machine learning pipelines and gain hands-on experience using Apache Spark. Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability and optimization.

No sessions available
4 Weeks
Introduction to Object-Oriented Programming with Java II: Object-Oriented Programming and Algorithms (edX) EdX
Georgia Institute of Technology,GTx

Introduction to Object-Oriented Programming with Java II: Object-Oriented Programming and Algorithms (edX)

Learn the basics of object-oriented programming and algorithms. Students will build on the skills learned from “Introduction to Object-Oriented Programming with Java I: Foundations and Syntax Basics” and learn the basics of writing classes that serve as blueprints of concepts or objects that are represented in a programming problem. Students will leverage the concepts of inheritance, interfaces, and polymorphism to program reusability and flexibility in classes. Finally, students will gain experience walking through and analyzing algorithms that are applied on data (including objects) in many object-oriented programs.

Self Paced
Self-Paced
Algorithms and Data Structures Capstone (edX) EdX
University of California, San Diego,UC San DiegoX

Algorithms and Data Structures Capstone (edX)

Synthesize your knowledge of algorithms and biology to build your own software for solving a biological challenge. Building a fully-fledged algorithm to assemble genomes from DNA fragments on a real dataset is an enormous challenge with major demand in the multi-billion dollar biotech industry. In this capstone project, we will take the training wheels off and let you design your own optimized software program for genome sequencing.

Self Paced
Self-Paced
Aplicaciones de la Teoría de Grafos a la vida real II (edX) EdX
Universitat Politècnica de València,UPValenciaX

Aplicaciones de la Teoría de Grafos a la vida real II (edX)

Aprenderemos a modelizar problemas del mundo real mediante su representación con grafos y a resolverlos mediante sus algoritmos asociados. Este curso trata la Teoría de Grafos desde el punto de vista de la modelización, lo que nos permitirá con posterioridad resolver muchos problemas de diversa índole. Presentaremos ejemplos de los distintos problemas en un contexto real, analizaremos la representación de éstos mediante grafos y veremos los algoritmos necesarios para resolverlos.

Self Paced
Self-Paced
Introduction to Java Programming: Fundamental Data Structures and Algorithms (edX) EdX
Universidad Carlos III de Madrid - UC3M,UC3Mx

Introduction to Java Programming: Fundamental Data Structures and Algorithms (edX)

Learn to enhance your code by using fundamental data structures and powerful algorithms in Java. In this introductory course, you will learn programming with Java in an easy and interactive way. You will learn about fundamental data structures, such as lists, stacks, queues and trees, and presents algorithms for inserting, deleting, searching and sorting information on these data structures in an efficient way.

Self Paced
Self-Paced
Autonomous Mobile Robots (edX) EdX
ETH Zurich,ETHx

Autonomous Mobile Robots (edX)

Basic concepts and algorithms for locomotion, perception, and intelligent navigation. Robots are rapidly evolving from factory workhorses, which are physically bound to their work-cells, to increasingly complex machines capable of performing challenging tasks in our daily environment. The objective of this course is to provide the basic concepts and algorithms required to develop mobile robots that act autonomously in complex environments.

Self Paced
Self-Paced