Learn about the role of data in a range of disciplines and about some fundamental tools for extracting knowledge from data. How can we answer questions about the world around us? How can we make decisions about what to do? In recent years, more and more people have turned to data for help. Huge amounts of data are collected every day from millions of sources. This data has a lot to tell us! But data by itself is mute: it can only help us if we learn to make it speak and tell its story.
In this short free online course, we will introduce basic ideas about collecting data, and techniques for turning data into information we can use. Along the way, we will hear from researchers at Loughborough University about the ways they use data in their work.
Learn to answer questions with data
In the first week, we will start by considering some questions drawn from arts, political science, geography and sport that we want to answer. We will think about what sort of data we might be able to use to answer these questions, and how we might go about finding this data.
Once we have data, we will start to explore it using visual tools that we can create either by hand or with online apps. We will discuss how to understand these visualisations and begin to read what our data has to say.
In the second week, we will follow up with ways to summarise and present data. You will learn how to choose the right summary for the type of data you have collected and the question you are trying to answer.
We will conclude with an article about how to make meaningful comparisons using data, and an explanation of the critical concept of significance. We will look at the data we have collected and use these techniques to see what it has to say about our starting questions.
Throughout the course, we will be collecting, sharing, analysing and discussing our own data and learning what it has to say about some specific questions.
Improve your critical thinking skills
Although some methods of analysing data are difficult and mathematically complicated, the fundamentals of data analysis come from general critical thinking and can be grasped with the basic examples and techniques we will cover. By the end of this course, you will have learned how data can help answer questions in a variety of disciplines, and you will have hands-on experience with data collection and analysis.
This course focuses on one of the most important tools in your data analysis arsenal: regression analysis. Using either SAS or Python, you will begin with linear regression and then learn how to adapt when two variables do not present a clear linear relationship. You will examine multiple predictors of your outcome and be able to identify confounding variables, which can tell a more compelling story about your results. You will learn the assumptions underlying regression analysis, how to interpret regression coefficients, and how to use regression diagnostic plots and other tools to evaluate the quality of your regression model. Throughout the course, you will share with others the regression models you have developed and the stories they tell you.
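As a rough illustration of the starting point described above, the following sketch fits a simple linear regression by ordinary least squares using only the Python standard library. The function names `fit_line` and `r_squared` and the study-hours data are made up for this example; the course itself may use different tooling, so treat this as a sketch rather than its method.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: hours studied and test scores (invented for illustration).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

intercept, slope = fit_line(hours, scores)
fit_quality = r_squared(hours, scores, intercept, slope)
print(f"score = {intercept:.1f} + {slope:.1f} * hours, R^2 = {fit_quality:.3f}")
```

The slope is read as the expected change in the outcome for a one-unit change in the predictor. A high R² alone does not show that the regression assumptions hold, which is why diagnostic plots are still needed.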
Learn why and how knowledge management and Big Data are vital to the new business era. The business landscape is changing so rapidly that traditional management, business and computing courses do not meet the needs of the next generation of workers in the business world. Most traditional methods are repetitive and rule-based, and will gradually be replaced by Artificial Intelligence.
In business, data and algorithms create economic value when they reduce uncertainty about financially important outcomes. This course teaches the concepts and mathematical methods behind the most powerful and universal metrics used by Data Scientists to evaluate the uncertainty reduction, or information gain, that predictive models provide. We focus on the two most common types of predictive model, binary classification and linear regression, and you will learn metrics to quantify for yourself the exact reduction in uncertainty each can offer. These metrics are applicable to any form of model that uses new information to improve predictions cast in the form of a known probability distribution, the standard way of representing forecasts in data science.
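One common way to put a number on the uncertainty reduction of a binary classifier is the mutual information, in bits, between predicted and actual labels. The sketch below computes it from a confusion matrix. It is one standard formulation and not necessarily the exact metric the course teaches; the function names and the counts are invented for illustration.

```python
from math import log2


def entropy(p):
    """Shannon entropy, in bits, of a binary outcome with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # A certain outcome carries no uncertainty.
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(tp, fp, fn, tn):
    """Uncertainty about the true label removed by knowing the prediction.

    Computed as H(actual) - H(actual | predicted) from confusion-matrix counts.
    """
    n = tp + fp + fn + tn
    base = entropy((tp + fn) / n)  # Uncertainty before seeing any prediction.
    pred_pos, pred_neg = tp + fp, fn + tn
    conditional = 0.0
    if pred_pos:
        conditional += pred_pos / n * entropy(tp / pred_pos)
    if pred_neg:
        conditional += pred_neg / n * entropy(fn / pred_neg)
    return base - conditional


# Hypothetical confusion matrices (counts invented for illustration).
perfect = information_gain(tp=50, fp=0, fn=0, tn=50)
useless = information_gain(tp=25, fp=25, fn=25, tn=25)
partial = information_gain(tp=40, fp=10, fn=10, tn=40)
print(perfect, useless, round(partial, 3))
```

A classifier that always agrees with the truth removes the full bit of uncertainty in a balanced problem, and one whose predictions are unrelated to the truth removes none.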
In this course you will learn how to use survey weights to estimate descriptive statistics, like means and totals, and more complicated quantities, like model parameters for linear and logistic regressions. Software capabilities will be covered, with R® receiving particular emphasis.
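To make the idea of weighted estimates concrete, here is a minimal sketch of a weighted total and weighted mean, where each weight is taken to be the number of population units a respondent represents. It is written in Python for illustration only (the course emphasises R), and the helper names and the income figures are hypothetical.

```python
def weighted_total(values, weights):
    """Estimate a population total: each response counts `weight` times."""
    return sum(w * y for y, w in zip(values, weights))


def weighted_mean(values, weights):
    """Estimate a population mean as the weighted total over the sum of weights."""
    return weighted_total(values, weights) / sum(weights)


# Hypothetical sample: incomes (in thousands) for three respondents, with the
# number of people in the population that each respondent stands for.
incomes = [30, 50, 70]
weights = [100, 200, 100]

estimated_total = weighted_total(incomes, weights)
estimated_mean = weighted_mean(incomes, weights)
print(estimated_total, estimated_mean)
```

Point estimates like these are the easy part. Standard errors depend on the sample design (strata, clusters), which this sketch does not attempt to handle.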
Good data collection is built on good samples. But samples can be chosen in many ways. Samples can be haphazard or convenient selections of persons, records, networks, or other units, but one questions the quality of such samples, especially what these selection methods mean for drawing good conclusions about a population after data collection and analysis are done. Samples can be more carefully selected based on a researcher’s judgment, but one then questions whether that judgment can be biased by personal factors.
Before you can work with data you have to get some. This course will cover the basic ways that data can be obtained. The course will cover obtaining data from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”. Tidy data dramatically speed downstream data analysis tasks. The course will also cover the components of a complete data set including raw data, processing instructions, codebooks, and processed data. The course will cover the basics needed for collecting, cleaning, and sharing data.
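As a small illustration of what making data "tidy" can involve, the sketch below reshapes a wide table, with one column per year, into a long table with one observation per row. The helper `to_tidy`, the column names, and the values are all invented for this example, and it uses plain Python rather than any particular data library.

```python
def to_tidy(rows, id_col, var_name, value_name):
    """Reshape wide rows into one row per (id, variable) observation."""
    tidy = []
    for row in rows:
        for col, val in row.items():
            if col == id_col:
                continue  # The identifier is repeated on each output row instead.
            tidy.append({id_col: row[id_col], var_name: col, value_name: val})
    return tidy


# Hypothetical wide table: the year is hidden in the column headers.
wide = [
    {"country": "A", "2019": 10, "2020": 12},
    {"country": "B", "2019": 7, "2020": 9},
]

tidy = to_tidy(wide, id_col="country", var_name="year", value_name="cases")
for record in tidy:
    print(record)
```

In the long form, each variable (country, year, cases) has its own column, which tends to make later filtering, grouping, and plotting more uniform.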
This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary.
This course is designed to impact the way you think about transforming data into better decisions. Recent extraordinary improvements in data-collecting technologies have changed the way firms make informed and effective business decisions. The course on operations analytics, taught by three of Wharton’s leading experts, focuses on how data can be used to profitably match supply with demand in various business settings. In this course, you will learn how to model future demand uncertainties, how to predict the outcomes of competing policy choices and how to choose the best course of action in the face of risk. The course will introduce frameworks and ideas that provide insights into a spectrum of real-world business challenges, and will teach you the methods and software available for tackling these challenges quantitatively, as well as the issues involved in gathering the relevant data.
This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.
Have you ever had the perfect data science experience? The data pull went perfectly. There were no merging errors or missing data. Hypotheses were clearly defined prior to analyses. Randomization was performed for the treatment of interest. The analytic plan was outlined prior to analysis and followed exactly. The conclusions were clear and actionable decisions were obvious. Has that ever happened to you? Of course not. Data analysis in real life is messy. How does one manage a team facing real data analyses? In this one-week course, we contrast the ideal with what happens in real life. Through this contrast, you will learn key concepts that will help you manage real-life analyses.
MOOCs – Massive Open Online Courses – enable students around the world to take university courses online. This guide, by the instructors of edX’s most successful MOOC in 2013-2014, Principles of Written English (based on both enrollments and rate of completion), advises current and future students how to get the most out of their online study, covering areas such as what types of courses are offered and who offers them, what resources students need, how to register, how to work effectively with other students, how to interact with professors and staff, and how to handle assignments. This second edition offers a new chapter on how to stay motivated. This book is suitable for both native and non-native speakers of English, and is applicable to MOOC classes on any subject (and indeed, for just about any type of online study).