Combining and Analyzing Complex Data (Coursera)

Combining and Analyzing Complex Data (Coursera)

In this course you will learn how to use survey weights to estimate descriptive statistics, like means and totals, and more complicated quantities like model parameters for linear and logistic regressions. Software capabilities will be covered with R® receiving particular emphasis. The course will also cover the basics of record linkage and statistical matching—both of which are becoming more important as ways of combining data from different sources. Combining of datasets raises ethical issues which the course reviews. Informed consent may have to be obtained from persons to allow their data to be linked. You will learn about differences in the legal requirements in different countries.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

Course 6 of 7 in the Survey Data Collection and Analytics Specialization.

Syllabus

WEEK 1
Basic Estimation
After completing Modules 1 and 2 of this course you will understand how to estimate descriptive statistics, overall and for subgroups, when you deal with survey data. We will review software for estimation (R, Stata, SAS) with examples for how to estimate things like means, proportions, and totals. You will also learn how to estimate parameters in linear, logistic, and other models and learn software options with emphasis on R. Module 3 and 4 discuss how you can add additional data to your analysis. This requires knowing about record linkage techniques, and what it takes to get permission to link data.

WEEK 2
Models
Module 2 covers how to estimate linear and logistic model parameters using survey data. After completing this module, you will understand how the methods used differ from the ones for non-survey data. We also cover the features of survey data sets that need to be accounted for when estimating standard errors of estimated model parameters.

WEEK 3
Record Linkage
Module starts with the current debate on using more (linked) administrative records in the U.S. Federal Statistical System, and a general motivation for linking records. Several examples will be given on why it is useful to link data. Challenges of record linkage will be discussed. A brief overview over key linkage techniques is included as well.

WEEK 4
Ethics
This module will discuss key issues in obtaining consent to record linkage. Failure to consent can lead to bias estimates. Current research examples will be given as well as practical suggestions on how to obtain linkage consent.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Practical Predictive Analytics: Models and Methods (Coursera) Coursera
University of Washington

Practical Predictive Analytics: Models and Methods (Coursera)

Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems.

Jun 8th 2026
4 Weeks
Exploratory Data Analysis (Coursera) Coursera
Johns Hopkins University

Exploratory Data Analysis (Coursera)

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data.

Jun 8th 2026
4 Weeks
Exploring and Producing Data for Business Decision Making (Coursera) Coursera
University of Illinois at Urbana-Champaign

Exploring and Producing Data for Business Decision Making (Coursera)

This course provides an analytical framework to help you evaluate key problems in a structured fashion and will equip you with tools to better manage the uncertainties that pervade and complicate business processes. Specifically, you will be introduced to statistics and how to summarize data and learn concepts of frequency, normal distribution, statistical studies, sampling, and confidence intervals.

Jun 8th 2026
4 Weeks
Effective Problem-Solving and Decision-Making (Coursera) Coursera
University of California, Irvine

Effective Problem-Solving and Decision-Making (Coursera)

Critical thinking – the application of scientific methods and logical reasoning to problems and decisions – is the foundation of effective problem solving and decision making. Critical thinking enables us to avoid common obstacles, test our beliefs and assumptions, and correct distortions in our thought processes. Gain confidence in assessing problems accurately, evaluating alternative solutions, and anticipating likely risks. Learn how to use analysis, synthesis, and positive inquiry to address individual and organizational problems and develop the critical thinking skills needed in today’s turbulent times. Using case studies and situations encountered by class members, explore successful models and proven methods that are readily transferable on-the-job.

Jun 8th 2026
4 Weeks
The Data Scientist's Toolbox (Coursera) Coursera
Johns Hopkins University

The Data Scientist's Toolbox (Coursera)

In this course you will get an introduction to the main tools and ideas in the data scientist's toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools that will be used in the program like version control, markdown, git, GitHub, R, and RStudio.

Jun 8th 2026
4 Weeks
Marketing Analytics (Coursera) Coursera
University of Virginia

Marketing Analytics (Coursera)

Organizations large and small are inundated with data about consumer choices. But that wealth of information does not always translate into better decisions. Knowing how to interpret data is the challenge -- and marketers in particular are increasingly expected to use analytics to inform and justify their decisions. Marketing analytics enables marketers to measure, manage and analyze marketing performance to maximize its effectiveness and optimize return on investment (ROI). Beyond the obvious sales and lead generation applications, marketing analytics can offer profound insights into customer preferences and trends, which can be further utilized for future marketing and business decisions.

Jun 8th 2026
5-12 Weeks
Leadership Through Marketing (Coursera) Coursera
Northwestern University

Leadership Through Marketing (Coursera)

The success of every organization depends on attracting and retaining customers. Although the marketing concepts for doing so are well established, digital technology has empowered customers, while producing massive amounts of data, revolutionizing the processes through which organizations attract and retain customers. In this course, students will learn how to identify new opportunities to create value for empowered consumers, develop strategies that yield an advantage over rivals, and develop the data science skills to lead more effectively, allocate resources, and to confront this very challenging environment with confidence.

Jun 14th 2026
4 Weeks
Decision-Making and Scenarios (Coursera) Coursera
University of Pennsylvania

Decision-Making and Scenarios (Coursera)

This course is designed to show you how use quantitative models to transform data into better business decisions. You’ll learn both how to use models to facilitate decision-making and also how to structure decision-making for optimum results. Two of Wharton’s most acclaimed professors will show you the step-by-step processes of modeling common business and financial scenarios, so you can significantly improve your ability to structure complex problems and derive useful insights about alternatives. Once you’ve created models of existing realities, possible risks, and alternative scenarios, you can determine the best solution for your business or enterprise, using the decision-making tools and techniques you’ve learned in this course.

Jun 8th 2026
4 Weeks
Data Manipulation at Scale: Systems and Algorithms (Coursera) Coursera
University of Washington

Data Manipulation at Scale: Systems and Algorithms (Coursera)

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making --- we are drowning in it. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.

Jun 8th 2026
4 Weeks
Introducción a Data Science: Programación Estadística con R (Coursera) Coursera
Universidad Nacional Autónoma de México

Introducción a Data Science: Programación Estadística con R (Coursera)

Este curso te proporcionará las bases del lenguaje de programación estadística R, la lengua franca de la estadística, el cual te permitirá escribir programas que lean, manipulen y analicen datos cuantitativos. Te explicaremos la instalación del lenguaje; también verás una introducción a los sistemas base de gráficos y al paquete para graficar ggplot2, para visualizar estos datos. Además también abordarás la utilización de uno de los IDEs más populares entre la comunidad de usuarios de R, llamado RStudio.

Jun 8th 2026
4 Weeks
Statistical Inference (Coursera) Coursera
Johns Hopkins University

Statistical Inference (Coursera)

Statistical inference is the process of drawing conclusions about populations or scientific truths from data. There are many modes of performing inference including statistical modeling, data oriented strategies and explicit use of designs and randomization in analyses. Furthermore, there are broad theories (frequentists, Bayesian, likelihood, design based, …) and numerous complexities (missing data, observed and unobserved confounding, biases) for performing inference.

Jun 8th 2026
4 Weeks