Want to learn the basics of large-scale data processing? Need to make predictive models but don’t know the right tools? This course will introduce you to open source tools you can use for parallel, distributed and scalable machine learning.
This course distills the expert knowledge and skills used by professionals in Health Big Data Science and Bioinformatics. You will learn about human biology, chemistry, genetics, and medicine alongside the science of Big Data, and you will gain the skills to harness the avalanche of openly available data that we are just starting to make sense of.
We’ll investigate the different steps required to master Big Data analytics on real datasets, including Next Generation Sequencing data, in a healthcare and biological context, from preparing data for analysis to completing the analysis, interpreting the results, visualizing them, and sharing the results.
Once you master these high-demand skills, you will be well positioned to apply for or move into positions in biomedical data analytics and bioinformatics. Whatever your skill level in biomedical or technical areas, you will gain valuable new or sharpened skills that will make you stand out as a professional and want to dive even deeper into biomedical Big Data. It is my hope that this course will spark your interest in the vast possibilities offered by publicly available Big Data to better understand, prevent, and treat diseases.
Who is this class for:
This course is primarily aimed at health care professionals or assistants, and those with a BS/MA/MS in science or technology or equivalent professional experience. The minimum technical requirement is a good working knowledge of Excel spreadsheets. Prior knowledge of basic statistics is preferred; however, additional resources will be made available to learners who need to acquire it. Anyone interested in how to harness Big Data to better understand, prevent, and treat diseases can take this course, because the material can be applied at different levels of expertise.
Genes and Data
After this module, you will be able to: 1. Locate and download files for data analysis involving genes and medicine. 2. Open files and preprocess data using the R language. 3. Write R scripts to replace missing values, normalize data, discretize data, and sample data.
Graded: Module 1 Quiz
Graded: Module 1 cBioPortal Data Analytics
Preparing Datasets for Analysis
After this module, you will be able to: 1. Locate and download files for data analysis involving genes and medicine. 2. Open files and preprocess data using the R language. 3. Write R scripts to replace missing values, normalize data, discretize data, and sample data.
Graded: Module 2 Quiz
Graded: Module 2 R Data Preprocessing
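The four preprocessing steps named in this module (replacing missing values, normalizing, discretizing, and sampling) can be illustrated with a small sketch. The course itself works in R; the example below uses Python with only the standard library, purely for illustration. The function names and the expression values are made up for this sketch and do not come from the course materials. Mean imputation, min-max scaling, and equal-width binning are just one common choice for each step.

```python
import random
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, n_bins=3):
    """Assign each value to one of n_bins equal-width bins (labels 0..n_bins-1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def sample_rows(rows, k, seed=0):
    """Draw k rows without replacement, reproducibly via a fixed seed."""
    return random.Random(seed).sample(rows, k)


# Hypothetical expression values for one gene across four samples.
expression = [2.0, None, 4.0, 6.0]
imputed = impute_mean(expression)
scaled = min_max_normalize(imputed)
bins = discretize(imputed, n_bins=2)
subset = sample_rows(imputed, k=2)
```

On real data the order of the steps matters: imputing before normalizing, as here, means the filled-in values influence the scaling range only if they are themselves extreme, which a mean never is.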
Finding Differentially Expressed Genes
After this module, you will be able to: 1. Select features from high-dimensional datasets. 2. Evaluate the performance of feature selection methods. 3. Write R scripts to select features from datasets involving gene expression.
Graded: Module 3 Quiz
Graded: Module 3 R Finding Differentially Expressed Genes
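One simple way to look for differentially expressed genes is to score each gene with a two-sample statistic and rank genes by the size of that score. The sketch below uses Welch's t-statistic. This is an illustrative choice, not necessarily the method taught in the module, and it is written in Python with only the standard library rather than in R, which the course uses. The gene names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples.

    Assumes each sample has at least two values and that the samples are
    not both constant (otherwise the standard error would be zero).
    """
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def rank_genes(group_1, group_2):
    """Return gene names sorted by decreasing absolute t-statistic."""
    scores = {gene: welch_t(group_1[gene], group_2[gene]) for gene in group_1}
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)


# Hypothetical expression levels per gene in two groups of three samples.
tumor = {"GENE_A": [8.0, 9.0, 10.0], "GENE_B": [5.0, 6.0, 4.0]}
normal = {"GENE_A": [2.0, 3.0, 4.0], "GENE_B": [5.0, 4.0, 6.0]}

ranked = rank_genes(tumor, normal)
```

With thousands of genes, a ranking like this is normally followed by a multiple-testing correction before any gene is reported as differentially expressed.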
Predicting Diseases from Genes
After this module, you will be able to: 1. Build classification and prediction models. 2. Evaluate the performance of classification and prediction methods. 3. Write R scripts to classify and predict diseases from gene expression data.
Graded: Module 4 Quiz
Graded: Module 4 R Predicting Diseases from Genes
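Building a classifier and evaluating it can be sketched in a few lines. The example below uses a nearest-centroid classifier with leave-one-out evaluation. Both are illustrative choices made for this sketch, not necessarily the methods covered in the module, and the code is Python with only the standard library rather than R, which the course uses. The samples and labels are invented.

```python
from math import dist
from statistics import mean


def fit_centroids(samples, labels):
    """Compute the mean expression profile (centroid) of each class."""
    model = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        model[label] = [mean(column) for column in zip(*rows)]
    return model


def predict(model, sample):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: dist(model[label], sample))


def leave_one_out_accuracy(samples, labels):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit_centroids(train_x, train_y)
        correct += predict(model, samples[i]) == labels[i]
    return correct / len(samples)


# Hypothetical two-gene expression profiles for six patients.
samples = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
           [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]]
labels = ["healthy"] * 3 + ["disease"] * 3

accuracy = leave_one_out_accuracy(samples, labels)
prediction = predict(fit_centroids(samples, labels), [3.0, 3.0])
```

Evaluating on held-out samples, rather than on the data used for training, is what keeps the accuracy estimate from being overly optimistic.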
Determining Gene Alterations
After this module, you will be able to: 1. List different types of gene alterations. 2. Compare and contrast methods for detecting gene mutations. 3. Compare and contrast methods for detecting methylation. 4. Compare and contrast methods for detecting copy number variations. 5. Quantify genomic alterations. 6. Connect genomic alterations to differential expression of genes. 7. Write programs in R for determining gene alterations and their relationship with gene expression.
Graded: Module 5 Quiz
Graded: Module 5 R Gene Alterations
Clustering and Pathway Analysis
After this module, you will be able to: 1. Find clusters in biomedical data involving genes. 2. Analyze and visualize biological pathways. 3. Write R scripts for clustering and for pathway analysis.
Graded: Module 6 Quiz
Graded: Module 6 R Clustering and Pathways
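Finding clusters in expression data can be illustrated with k-means, one of several common clustering methods. It is used here as an example only and is not necessarily the method taught in the module. The sketch is written in Python with only the standard library rather than in R, which the course uses. The starting centroids are passed in explicitly so the result is deterministic, and the profiles are invented.

```python
from math import dist
from statistics import mean


def k_means(points, initial_centroids, max_iter=100):
    """Alternate between assigning points to the nearest centroid and
    recomputing centroids, until the assignment stops changing."""
    centroids = [list(c) for c in initial_centroids]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [mean(column) for column in zip(*members)]
    return assignment, centroids


# Hypothetical two-gene expression profiles for four samples.
profiles = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]]
assignment, centers = k_means(
    profiles, initial_centroids=[profiles[0], profiles[2]]
)
```

K-means results depend on the starting centroids, so in practice the algorithm is usually run several times from different starting points and the best solution is kept.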