Inference on a numeric response, in the presence of predictors.
By the end of the course, students are expected to:
- Fit a linear regression model using R and `broom`.
- Interpret how predictors influence a response using a fitted linear regression model.
- Identify whether a linear regression model is appropriate for a given dataset.
- Identify use cases of linear regression.
- Evaluate the fit of a regression model using residual plots.
- Evaluate the fit of a regression model using appropriate goodness-of-fit measures (MSE and R-squared), and connect these measures back to the null model.
- Quantify estimation error vs. prediction error in the presence of predictors, and understand the decomposition of error in each case.
- Understand the effect of multicollinearity on an OLS estimate.
- Convert categorical predictors for use in a linear regression model.
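As a rough preview of several of these objectives, the sketch below fits a regression with a categorical predictor, tidies it with `broom`, and relates R-squared to the null model. The `mtcars` dataset and the choice of `mpg`, `wt`, and `cyl` are assumptions made purely for illustration and are not taken from the course material.

```r
# Sketch only: mtcars and the variables below are illustrative choices,
# not course material.
library(broom)

# Treat the number of cylinders as a categorical predictor.
cars <- transform(mtcars, cyl = factor(cyl))

# Fit a linear regression and an intercept-only null model.
fit <- lm(mpg ~ wt + cyl, data = cars)
null_fit <- lm(mpg ~ 1, data = cars)

# One row per coefficient: estimate, standard error, statistic, p-value.
coef_table <- tidy(fit)

# One-row summary of model-level measures, including r.squared.
fit_summary <- glance(fit)

# In-sample mean squared error for each model.
mse_fit <- mean(residuals(fit)^2)
mse_null <- mean(residuals(null_fit)^2)

# R-squared is the proportional reduction in MSE relative to the null model.
r_squared <- 1 - mse_fit / mse_null

print(coef_table)
print(fit_summary$r.squared)
print(r_squared)
```

The last two printed values should agree, which is one way to see R-squared as a comparison against the null model rather than a standalone number.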
This is an assignment-based course. You'll be evaluated as follows:
Assessment | Weight | Deadline | Submit to |
---|---|---|---|
Lab Assignment 1 | 15% | Saturday, Nov 24 at 18:00 | GitHub |
Lab Assignment 2 | 15% | Saturday, Dec 1 at 18:00 | GitHub |
Lab Assignment 3 | 15% | Saturday, Dec 8 at 18:00 | GitHub |
Lab Assignment 4 | 15% | Wednesday, Dec 12 at 18:00 | GitHub |
Quiz 1 | 20% | Monday, Sept 24, 14:00-14:30 | TBD (aiming for Canvas) |
Quiz 2 | 20% | Thursday, Dec 13 | TBD (aiming for Canvas) |
Lecture | Topic |
---|---|
1 | Review of statistical inference; connection between the two-sample t-test, ANOVA, and linear regression |
2 | Linear model in general matrix notation, different types of predictors, interpretation of coefficients and parametrizations, estimation and inference |
3 | Continuous and categorical predictors, interaction terms, interpretation of coefficients, estimation and inference |
4 | Least squares estimation, fitted values, residuals, confidence intervals |
5 | Multiple linear regression, out-of-sample predictions, prediction intervals |
6 | Goodness of fit, estimation error, prediction error |
7 | Transformations, multicollinearity, diagnostics, unusual and influential data |
8 | Bootstrapping |
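
The connection named in Lecture 1 can be previewed with a small sketch: a two-sample t-test with equal variances gives the same inference as a regression on a binary predictor. The `mtcars` dataset and the `am` grouping variable are assumptions chosen for illustration only.

```r
# Sketch only: mtcars is an illustrative dataset, not course material.
# Compare mpg between automatic (am = 0) and manual (am = 1) cars.

# Two-sample t-test assuming equal variances in both groups.
tt <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

# Linear regression of mpg on the binary predictor am.
fit <- lm(mpg ~ am, data = mtcars)
slope <- coef(summary(fit))["am", ]

# The slope equals the difference in group means (manual minus automatic).
mean_diff <- unname(tt$estimate[2] - tt$estimate[1])

# The t statistics agree up to sign, and the p-values are identical.
print(c(t_test = unname(tt$statistic), regression = unname(slope["t value"])))
print(c(t_test = tt$p.value, regression = unname(slope["Pr(>|t|)"])))
```

The sign of the t statistic differs only because `t.test` subtracts the groups in the opposite order from the regression slope.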
- Intro to Statistical Learning (ISLR), especially Chapter 3.
  - A modern and approachable take on statistics / machine learning.
- R for Data Science (r4ds), especially Part IV.
  - Practical and approachable book on the use of R for data science.
- Linear Models with R
  - Comprehensive book on linear models.
- OpenIntro Statistics
  - Fairly accessible, though it seems to lean towards a traditional approach. Chapters 7 & 8 are relevant for linear regression.
Please see the general MDS policies.