Data Science

 

 


 

Customize your search:

E.g., 2016-12-11
E.g., 2016-12-11
E.g., 2016-12-11
Feb 1st 2017

Digital images of earth’s surface produced by remote sensing are the basis of modern mapping. They are also used to create valuable information products across a spectrum of industries. This free online course is for everyone who is interested in applications of earth imagery to increase productivity, save money, protect the environment, and even save lives.

Average: 3.8 (5 votes)
Dec 12th 2016

In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing which includes programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.

Average: 5.6 (24 votes)
Dec 12th 2016

In this course, you will learn the fundamental techniques for making personalized recommendations through nearest-neighbor techniques. First you will learn user-user collaborative filtering, an algorithm that identifies other people with similar tastes to a target user and combines their ratings to make recommendations for that user.

Average: 6.5 (2 votes)
Dec 12th 2016

This course describes Bayesian statistics, in which one's inferences about parameters or hypotheses are updated as evidence accumulates. You will learn to use Bayes’ rule to transform prior probabilities into posterior probabilities, and be introduced to the underlying theory and perspective of the Bayesian paradigm.

No votes yet
Dec 12th 2016

Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.

Average: 5.4 (13 votes)
Dec 12th 2016

In business, data and algorithms create economic value when they reduce uncertainty about financially important outcomes. This course teaches the concepts and mathematical methods behind the most powerful and universal metrics used by Data Scientists to evaluate the uncertainty-reduction – or information gain - predictive models provide. We focus on the two most common types of predictive model - binary classification and linear regression - and you will learn metrics to quantify for yourself the exact reduction in uncertainty each can offer. These metrics are applicable to any form of model that uses new information to improve predictions cast in the form of a known probability distribution – the standard way of representing forecasts in data science.

Average: 10 (1 vote)
Dec 12th 2016

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference including statistical modeling, data oriented strategies and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentists, Bayesian, likelihood, design based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference. A practitioner can often be left in a debilitating maze of techniques, philosophies and nuance.

Average: 7.1 (11 votes)
Dec 12th 2016

The abilities to understand and apply Business Statistics are becoming increasingly important in the industry. A good understanding of Business Statistics is a requirement to make correct and relevant interpretations of data. Lack of knowledge could lead to erroneous decisions which could potentially have negative consequences for a firm. This course is designed to introduce you to Business Statistics.

No votes yet
Dec 12th 2016

Discover the basic concepts of cluster analysis, and then study a set of typical clustering methodologies, algorithms, and applications. This includes partitioning methods such as k-means, hierarchical methods such as BIRCH, and density-based methods such as DBSCAN/OPTICS. Moreover, learn methods for clustering validation and evaluation of clustering quality. Finally, see examples of cluster analysis in applications.

Average: 5.5 (2 votes)
Dec 12th 2016

One of the skills that characterizes great business data analysts is the ability to communicate practical implications of quantitative analyses to any kind of audience member. Even the most sophisticated statistical analyses are not useful to a business if they do not lead to actionable advice, or if the answers to those business questions are not conveyed in a way that non-technical people can understand. In this course you will learn how to become a master at communicating business-relevant implications of data analyses.

Average: 6.7 (3 votes)
Dec 12th 2016

This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary.

Average: 6.7 (3 votes)
Dec 12th 2016

This course aims to help you to draw better statistical inferences from empirical research. First, we will discuss how to correctly interpret p-values, effect sizes, confidence intervals, Bayes Factors, and likelihood ratios, and how these statistics answer different questions you might be interested in. Then, you will learn how to design experiments where the false positive rate is controlled, and how to decide upon the sample size for your study, for example in order to achieve high statistical power.

Average: 10 (1 vote)
Dec 12th 2016

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.

Average: 7.2 (5 votes)
Dec 12th 2016

In this course, you will analyze and apply essential design principles to your Tableau visualizations. This course assumes you understand the tools within Tableau and have some knowledge of the fundamental concepts of data visualization.

No votes yet
Dec 12th 2016

Case Study - Predicting Housing Prices
In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,...). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.In this course, you will explore regularized linear regression models for the task of prediction and feature selection. You will be able to handle very large sets of features and select between models of various complexity. You will also analyze the impact of aspects of your data -- such as outliers -- on your selected models and predictions. To fit these models, you will implement optimization algorithms that scale to large datasets.

Average: 7.5 (4 votes)
Dec 12th 2016

This course provides a rigorous introduction to the R programming language, with a particular focus on using R for software development in a data science setting. Whether you are part of a data science team or working individually within a community of developers, this course will give you the knowledge of R needed to make useful contributions in those settings.

Average: 4 (5 votes)
Dec 12th 2016

Whether being used to customize advertising to millions of website visitors or streamline inventory ordering at a small restaurant, data is becoming more integral to success. Too often, we’re not sure how use data to find answers to the questions that will make us more successful in what we do. In this course, you will discover what data is and think about what questions you have that can be answered by the data – even if you’ve never thought about data before. Based on existing data, you will learn to develop a research question, describe the variables and their relationships, calculate basic statistics, and present your results clearly. By the end of the course, you will be able to use powerful data analysis tools – either SAS or Python – to manage and visualize your data, including how to deal with missing data, variable groups, and graphs. Throughout the course, you will share your progress with others to gain valuable feedback, while also learning how your peers use data to answer their own questions.

Average: 7.7 (7 votes)
Dec 12th 2016

Neurohacking describes how to use the R programming language and its associated package to perform manipulation, processing, and analysis of neuroimaging data. We focus on publicly-available structural magnetic resonance imaging (MRI). We discuss concepts such as inhomogeneity correction, image registration, and image visualization.

No votes yet
Dec 12th 2016

Linear models, as their name implies, relates an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist’s toolkit. This course covers regression analysis, least squares and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA will be covered as well. Analysis of residuals and variability will be investigated. The course will cover modern thinking on model selection and novel uses of regression models including scatterplot smoothing.

Average: 7.7 (6 votes)
Dec 12th 2016

Writing good code for data science is only part of the job. In order to maximizing the usefulness and reusability of data science software, code must be organized and distributed in a manner that adheres to community-based standards and provides a good user experience. This course covers the primary means by which R software is organized and distributed to others.

Average: 3 (1 vote)

Pages