Oct 24th 2016

Genome Sequencing (Bioinformatics II) (Coursera)

Biologists still cannot read the nucleotides of an entire genome as you would read a book from beginning to end. However, they can read short pieces of DNA. In this course, we will see how graph theory can be used to assemble genomes from these short pieces. We will further learn about brute force algorithms and apply them to sequencing mini-proteins called antibiotics. Finally, you will learn how to apply popular bioinformatics software tools to sequence the genome of a deadly Staphylococcus bacterium.

In "Finding Hidden Messages in DNA", we discussed how to separate some of the signal from the apparent noise of DNA sequences. But how do we know what the DNA sequence making up a genome is in the first place? After all, DNA nucleotides are far too small to view with a normal microscope, and biologists still do not possess technology that would read all the nucleotides of your genome from beginning to end.

In this course, you will learn how entire genomes are assembled from millions of short overlapping pieces of DNA. The scale of this problem (the human genome is 3 billion nucleotides long!) implies that computers must be involved. Yet the problem is even more complex than it may appear. To solve it, we will need to travel back in time to meet three famous mathematicians and learn about algorithms based on graph theory.
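
To make the idea of assembling a genome from short overlapping pieces more concrete, here is a minimal sketch of one common graph-based approach: build a de Bruijn graph from the pieces and walk an Eulerian path through it. The description above does not name the specific method taught, so the choice of technique is an assumption, and the function names (kmers, de_bruijn, eulerian_path, assemble) are hypothetical. The sketch also assumes error-free reads of equal length taken from a single strand, with every k-mer of the genome present.

```python
from collections import defaultdict


def kmers(text, k):
    """Return every length-k substring of text, in order (idealized 'reads')."""
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def de_bruijn(reads):
    """Build a de Bruijn graph: each read is an edge from its prefix to its suffix."""
    graph = defaultdict(list)
    for read in reads:
        graph[read[:-1]].append(read[1:])
    return graph


def eulerian_path(graph):
    """Find a path that uses every edge exactly once (Hierholzer's algorithm).

    Assumes the graph is non-empty and actually contains such a path.
    """
    in_degree = defaultdict(int)
    for targets in graph.values():
        for node in targets:
            in_degree[node] += 1
    # Start at the node with one more outgoing than incoming edge, if any;
    # otherwise the graph is a cycle and any node with edges will do.
    start = next(
        (n for n in graph if len(graph[n]) - in_degree[n] == 1),
        next(iter(graph)),
    )
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads):
    """Spell out the string along an Eulerian path of the de Bruijn graph.

    With repeated regions, several Eulerian paths may exist, so the result
    is one consistent reconstruction, not necessarily the original genome.
    """
    path = eulerian_path(de_bruijn(reads))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    # Reads are sorted to mimic not knowing their original order.
    reads = sorted(kmers("ATGCGTAC", 3))
    print(assemble(reads))
```

Each read becomes an edge rather than a node, so the assembly question turns into finding a path that uses every edge once, which can be done in time linear in the number of reads. Real sequencing data contains errors, gaps in coverage, and repeats, all of which this sketch ignores.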

Later in the course, we will see that sequencing genomes is not the only task related to decoding biological macromolecules. Another difficult problem is sequencing antibiotics, mini-proteins engineered by bacteria to fight each other. Even though antibiotics often contain fewer than 10 amino acids, sequencing them is a formidable challenge. Decoding the sequence of amino acids making up an antibiotic is an important biomedical problem, but the practical barriers to sequencing short antibiotics are often more substantial than the barriers to assembling a genome with millions of nucleotides! To address this computational challenge, we will learn about brute force algorithms that often succeed in various bioinformatics applications.

Finally, in this course, you will learn how to apply popular bioinformatics software tools to assemble the genome of a deadly Staphylococcus bacterium. You will also be introduced to the popular cloud service BaseSpace offered by Illumina, the leading DNA sequencing company, thus joining the thousands of biologists and bioinformaticians who use BaseSpace every day.

Genome Sequencing (Bioinformatics II) is course 2 of 7 in the Bioinformatics Specialization.

How do we sequence and compare genomes? How do we identify the genetic basis for disease? By the time you complete this Specialization, you will have learned how to answer many questions such as these in modern biology. In the process, you will learn about the algorithms and software tools that thousands of biologists apply at work every day in one of the fastest growing fields in science. Please learn more about the Bioinformatics Specialization (including why we are wearing these crazy outfits) by watching our introductory video. You can purchase the Specialization's printed companion, Bioinformatics Algorithms: An Active Learning Approach, from the textbook website. This Specialization also features an "Honors Track" (called "hacker track" in previous runs of the course). The Honors Track allows you to get your hands dirty by implementing the bioinformatics algorithms that you encounter along the way in a series of dozens of code challenges. By completing the Honors Track, you will be a true bioinformatics software professional! :)