Johns Hopkins University

 

 


 

Apr 3rd 2017

By now you have definitely heard about data science and big data. In this one-week class, we will provide a crash course in what these terms mean and how they play a role in successful organizations. This class is for anyone who wants to learn what all the data science action is about, including those who will eventually need to manage data scientists. The goal is to get you up to speed as quickly as possible on data science without all the fluff. We've designed this course to be as convenient as possible without sacrificing any of the essentials.

Average: 7.2 (13 votes)
Apr 3rd 2017

In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, which include programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.

Average: 5.6 (24 votes)
Apr 3rd 2017

This course covers advanced topics in R programming that are necessary for developing powerful, robust, and reusable data science tools. Topics covered include functional programming in R, robust error handling, object oriented programming, profiling and benchmarking, debugging, and proper design of functions.

Average: 6.4 (5 votes)
Apr 3rd 2017

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.

Average: 6.1 (16 votes)
Apr 3rd 2017

This one-week course describes the process of analyzing data and how to manage that process. We describe the iterative nature of data analysis and the role of stating a sharp question, exploratory data analysis, inference, formal statistical modeling, interpretation, and communication. In addition, we will describe how to direct analytic activities within a team and to drive the data analysis process towards coherent and useful results.

Average: 6.3 (14 votes)
Apr 3rd 2017

Writing good code for data science is only part of the job. In order to maximize the usefulness and reusability of data science software, code must be organized and distributed in a manner that adheres to community-based standards and provides a good user experience. This course covers the primary means by which R software is organized and distributed to others.

Average: 3 (1 vote)
Apr 3rd 2017

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference, including statistical modeling, data-oriented strategies, and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentist, Bayesian, likelihood, design-based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference. A practitioner can often be left in a debilitating maze of techniques, philosophies and nuance.

Average: 7.1 (11 votes)
Apr 3rd 2017

This course provides an introduction to systems thinking and systems models in public health. Problems in public health and health policy tend to be complex, with many actors, institutions and risk factors involved. If an outcome depends on many interacting and adaptive parts and actors, the outcome cannot be analyzed or predicted with traditional statistical methods. Systems thinking is a core skill in public health and helps health policymakers build programs and policies that are aware of and prepared for unintended consequences.

Average: 4 (3 votes)
Apr 3rd 2017

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary.

Average: 6.7 (3 votes)
Apr 3rd 2017

Learn to provide psychological first aid to people in an emergency by employing the RAPID model: Reflective listening, Assessment of needs, Prioritization, Intervention, and Disposition.

Average: 7.7 (18 votes)
Apr 3rd 2017

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

Average: 7.2 (5 votes)
Apr 3rd 2017

The data science revolution has produced reams of new data from a wide variety of new sources. These new datasets are being used to answer new questions in ways never before conceived. Visualization remains one of the most powerful ways to draw conclusions from data, but the influx of new data types requires the development of new visualization techniques and building blocks.

Average: 7.7 (6 votes)
Apr 3rd 2017

Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models including scatterplot smoothing.

Average: 7.7 (7 votes)
Apr 3rd 2017

A data product is the production output from a statistical analysis. Data products automate complex analysis tasks or use technology to expand the utility of a data-informed model, algorithm or inference. This course covers the basics of creating data products using Shiny, R packages, and interactive graphics. The course will focus on the statistical fundamentals of creating a data product that can be used to tell a story about data to a mass audience.

Average: 3.7 (13 votes)
Apr 3rd 2017

Synbio is a diverse field with diverse applications, and the different contexts (e.g., gain-of-function research, biofuels) raise different ethical and governance challenges. The objective of this course is to increase learners’ awareness and understanding of ethical and policy/governance issues that arise in the design, conduct and application of synthetic biology. The course will begin with a short history of recombinant DNA technology and how governance of that science developed and evolved, and progress through a series of areas of application of synbio.

No votes yet
Apr 3rd 2017

One of the most common tasks performed by data scientists and data analysts is prediction and machine learning. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and test sets, overfitting, and error rates. The course will also introduce a range of model-based and algorithmic machine learning methods including regression, classification trees, Naive Bayes, and random forests. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.

Average: 5.9 (18 votes)
Apr 3rd 2017

Neurohacking describes how to use the R programming language and its associated package to perform manipulation, processing, and analysis of neuroimaging data. We focus on publicly-available structural magnetic resonance imaging (MRI). We discuss concepts such as inhomogeneity correction, image registration, and image visualization.

No votes yet
Apr 3rd 2017

Have you ever had the perfect data science experience? The data pull went perfectly. There were no merging errors or missing data. Hypotheses were clearly defined prior to analyses. Randomization was performed for the treatment of interest. The analytic plan was outlined prior to analysis and followed exactly. The conclusions were clear and actionable decisions were obvious. Has that ever happened to you? Of course not. Data analysis in real life is messy. How does one manage a team facing real data analyses? In this one-week course, we contrast the ideal with what happens in real life. Through this contrast, you will learn key concepts that will help you manage real-life analyses.

Average: 6.9 (9 votes)
Apr 3rd 2017

This course provides a rigorous introduction to the R programming language, with a particular focus on using R for software development in a data science setting. Whether you are part of a data science team or working individually within a community of developers, this course will give you the knowledge of R needed to make useful contributions in those settings.

Average: 3.8 (6 votes)
Apr 3rd 2017

Data science is a team sport. As a data science executive, it is your job to recruit, organize, and manage the team to success. In this one-week course, we will cover how you can find the right people to fill out your data science team, how to organize them to give them the best chance to feel empowered and successful, and how to manage your team as it grows.

Average: 8 (5 votes)
