May 23rd 2016

Genomic Data Science and Clustering (Bioinformatics V) (Coursera)

How do we infer which genes orchestrate various processes in the cell? How did humans migrate out of Africa and spread around the world? In this class, we will see that these two seemingly different questions can be addressed using similar algorithmic and machine learning techniques arising from the general problem of dividing data points into distinct clusters.

One of the first organisms to be domesticated by humans was yeast. Saccharomyces yeast is remarkable because it can not only convert the glucose in grapes into ethanol (which we then consume as wine), but it can also invert its own metabolism, consuming the ethanol it just produced in a process called the diauxic shift. To find genes implicated in the diauxic shift, we will learn about clustering algorithms that will divide yeast genes into distinct groups based on their patterns of regulatory behavior. A similar method can be applied to distinguish normal and tumor cells, an approach that led to diagnostic tests like MammaPrint for predicting the return of cancer after chemotherapy.

We can also apply clustering algorithms to identify the genetic foundation of human population structure and discover which populations have contributed to your own genome. To do so, we will need to strengthen clustering algorithms with a computational approach called principal component analysis.

At the end of the course, a Bioinformatics Application Challenge will let you apply real bioinformatics software to cluster a large biological dataset.

Part of the Bioinformatics: Journey to the Frontier of Computational Biology Specialization.