Welcome to the Advanced Linear Models for Data Science Class 2: Statistical Linear Models. This class is an introduction to least squares from a linear algebraic and mathematical perspective.
Before beginning the class, make sure that you have the following:
- A basic understanding of linear algebra and multivariate calculus.
- A basic understanding of statistics and regression models.
- At least a little familiarity with proof-based mathematics.
- Basic knowledge of the R programming language.
After taking this course, students will have a firm foundation in a linear algebraic treatment of regression modeling. This will greatly augment applied data scientists' general understanding of regression models.
Who is this class for: This class is for students who have already taken a class in regression modeling, are familiar with the area, and would like to see a more advanced treatment of the topic.
Introduction and expected values
In this module, we cover the basics of the course as well as the prerequisites. We then cover the basics of expected values for multivariate vectors. We conclude with the moment properties of the ordinary least squares estimates.
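As a rough illustration of the moment properties mentioned above, the following is a sketch under assumed notation that the module description itself does not fix: a linear model $y = X\beta + \varepsilon$ with fixed, full column rank $X$, $E[\varepsilon] = 0$, and $\mathrm{Var}(\varepsilon) = \sigma^2 I$. It uses only the rules $E[AY + b] = A\,E[Y] + b$ and $\mathrm{Var}(AY) = A\,\mathrm{Var}(Y)\,A^\top$ for a constant matrix $A$ and constant vector $b$.

```latex
% Assumed setup (not stated in the course description):
%   y = X\beta + \varepsilon, \quad E[\varepsilon] = 0, \quad
%   \mathrm{Var}(\varepsilon) = \sigma^2 I, \quad X \text{ fixed with full column rank}.
\begin{align*}
\hat\beta &= (X^\top X)^{-1} X^\top y \\
E[\hat\beta] &= (X^\top X)^{-1} X^\top E[y]
             = (X^\top X)^{-1} X^\top X \beta
             = \beta \\
\mathrm{Var}(\hat\beta) &= (X^\top X)^{-1} X^\top \,\mathrm{Var}(y)\, X (X^\top X)^{-1} \\
             &= \sigma^2 (X^\top X)^{-1} X^\top X (X^\top X)^{-1}
              = \sigma^2 (X^\top X)^{-1}
\end{align*}
```

Under these assumptions the estimator is unbiased, and its variance depends on the design only through $(X^\top X)^{-1}$.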
Graded: Expected Values
The multivariate normal distribution
In this module, we build up the multivariate and singular normal distributions by starting with iid normals.
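One common way to carry out this construction, sketched here as an assumption rather than as the course's own presentation, is to take a vector Z of iid standard normals and set X = mu + A Z, where A A^T equals the target covariance. The example below uses Python's standard library only; the names `cholesky_2x2` and `draw_mvn`, and the particular `mu` and `sigma`, are hypothetical and chosen for illustration.

```python
import random

def cholesky_2x2(sigma):
    """Lower-triangular A with A A^T = sigma, for a 2x2 positive definite sigma."""
    a11 = sigma[0][0] ** 0.5
    a21 = sigma[1][0] / a11
    a22 = (sigma[1][1] - a21 ** 2) ** 0.5
    return [[a11, 0.0], [a21, a22]]

def draw_mvn(mu, a, rng):
    """One draw X = mu + A Z, where Z holds two iid standard normals."""
    z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    return [mu[i] + a[i][0] * z[0] + a[i][1] * z[1] for i in range(2)]

# Illustrative (assumed) mean vector and covariance matrix.
mu = [1.0, -2.0]
sigma = [[4.0, 1.2], [1.2, 1.0]]

a = cholesky_2x2(sigma)
rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [draw_mvn(mu, a, rng) for _ in range(20000)]

# Sample mean and sample covariance between the two coordinates.
n = len(draws)
mean = [sum(d[i] for d in draws) / n for i in range(2)]
cov01 = sum((d[0] - mean[0]) * (d[1] - mean[1]) for d in draws) / (n - 1)
print("sample mean:", mean)
print("sample cov(X1, X2):", cov01)
```

The sample mean and covariance should land close to `mu` and `sigma[0][1]`. If A is allowed to be rank deficient, the same recipe gives a singular normal, although the Cholesky helper above assumes a positive definite covariance.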
Graded: The multivariate normal
In this module, we build the basic distributional results that we see in multivariable regression.
Graded: Distributional results
In this module we will revisit residuals and consider their distributional results. We also consider the so-called PRESS residuals and show how they can be calculated without re-fitting the model.
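The no-refit calculation can be checked numerically. The sketch below assumes the standard identity that the i-th PRESS residual equals e_i / (1 - h_ii), where e_i is the ordinary residual and h_ii is the i-th diagonal entry of the hat matrix. To stay within Python's standard library it is restricted to simple linear regression, where h_ii = 1/n + (x_i - xbar)^2 / Sxx. The function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Least squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def press_shortcut(x, y):
    """PRESS residuals e_i / (1 - h_ii), computed from a single fit."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    out = []
    for xi, yi in zip(x, y):
        e = yi - (b0 + b1 * xi)               # ordinary residual
        h = 1.0 / n + (xi - xbar) ** 2 / sxx  # leverage of point i
        out.append(e / (1.0 - h))
    return out

def press_refit(x, y):
    """PRESS residuals the slow way: drop each point, refit, predict it."""
    out = []
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        out.append(y[i] - (b0 + b1 * x[i]))
    return out

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 7.0]
y = [1.1, 1.9, 3.2, 3.8, 5.3, 6.6]

shortcut = press_shortcut(x, y)
refit = press_refit(x, y)
print("largest disagreement:", max(abs(s - r) for s, r in zip(shortcut, refit)))
```

The two lists should agree up to floating point error, which is the sense in which the PRESS residuals need only one model fit.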