Statistical Modelling
Autumn semester 2021
General information
Lecturer | Christina Heinze-Deml |
---|---|
Assistant | Juan Gamella |
Lectures | Mon 10-12 ML D 28; Thu 14-16 HG E 1.1 |
Course catalogue data | >> |
Course content
In regression analysis, we examine the relationship between a random response variable and several explanatory variables. In this class, we consider the theory of linear regression with one or more explanatory variables. We also study robust methods, generalized linear models, model choice, high-dimensional linear models, nonlinear models and nonparametric methods. Several numerical examples will illustrate the theory. You will learn to perform a regression analysis and interpret the results correctly. We will use the statistical software R to gain hands-on experience.
Announcements
- September 17th 2021:
- The lectures will be held in hybrid mode: they take place in person and are also livestreamed and recorded via Zoom. The Zoom details can be found on Moodle.
- Starting on September 30th, the exercise classes will take place every second Thursday. The first exercise session will include an introduction to the statistical programming language R with some exercises. In the exercise sessions, you can work on the R problems and the series, and ask questions. There will be a lecture every Monday, and the Thursday class will alternate between lectures and exercise sessions (exceptions will be announced).
- We will be using the ETH EduApp during the lectures for clicker questions. Please install it on your phone or tablet (iOS, Android). If this is not possible, you can also access the EduApp via the Web App.
Course materials
- Zoom link and lecture recordings: See Moodle.
- The handwritten notes made during the lectures can be found here. If you have trouble accessing these, please let us know.
- The datasets used in the R scripts shown during the lectures can be found here.
- Two old exams are made available here. Note: The course was then taught by a different lecturer and some of the covered material differs.
Week | Topic |
---|---|
Week 1 | Introduction |
Week 2 | Classical linear model |
Week 2 - exercise | Introduction to R |
Week 3 - I | Classical linear model |
Week 3 - II | Classical linear model |
Week 4 - I | Classical linear model |
Week 4 - exercise | Exercise |
Week 5 - I | Hypothesis testing |
Week 5 - II | Hypothesis testing and confidence intervals |
Week 6 | Multiple testing I |
Week 6 - exercise | Exercise |
Week 7 - I | Multiple testing II |
Week 7 - II | Prediction intervals and model selection |
Week 8 - I | Model selection |
Week 8 - exercise | Tutorial - Linear algebra review: Positive-definite matrices |
Week 9 - I | Model diagnostics |
Week 9 - II | Model diagnostics |
Week 10 - I | General linear model, weighted least squares and robust regression |
Week 10 - II | Tutorial - clicker questions on multiple testing |
Week 11 - I | Cancelled - no class |
Week 11 - II | Robust regression and generalized linear models |
Week 12 - I | Generalized linear models |
Week 12 - II | Tutorial: ROC curves |
Week 13 - I | Generalized linear models and penalized regression |
Week 13 - II | Penalized regression |
Week 14 - I | Knockoffs |
Week 14 - II | Tutorial: Subdifferentials, exponential families |
Software
Examples in the lecture as well as solutions to the exercises will be based on the statistical software R. R is a freely available open-source program that runs on all major platforms and has become a worldwide standard for data analysis. It can be downloaded from CRAN. An R Tutorial can be found here. The most commonly used editor for R is RStudio, which can be downloaded from here.
Exercise classes
Exercise classes will take place every other week on Thursdays. The first exercise class on September 30th will feature an R tutorial with some exercises. Please install R and RStudio and bring your laptop to the exercise classes.
If you would like the TA to go over a particular topic or exercise during the tutorial, please write an email or post your question here in advance.
Would you like something done differently? You can leave anonymous feedback here.
Series
Exercises | Solutions | Due date |
---|---|---|
R Series | R Solutions | None |
Series 1 | Solutions 1 | October 14th 2021 |
Series 2 | Solutions 2 | October 28th 2021 |
Series 3 | Solutions 3 (updated Nov 17) | 14:00, November 9th 2021 |
Series 4 (updated Nov 12) | Solutions 4 | 14:00, November 23rd 2021 |
Series 5 | Solutions 5 | 14:00, November 30th 2021 |
Series 6 | Solutions 6 | 14:00, December 21st 2021 |
Series 7 | Solutions 7 | None |
Literature
- L. Fahrmeir, T. Kneib, S. Lang and B. Marx (2013). Regression - Models, Methods and Applications. Springer.
- T. Hastie, R. Tibshirani and J. Friedman (2009). The Elements of Statistical Learning [ESL]. 2nd edition, Springer.
- G. James, D. Witten, T. Hastie and R. Tibshirani (2021). An Introduction to Statistical Learning: with Applications in R [ISLR]. 2nd edition, Springer.
- Script by Peter Bühlmann, Nicolai Meinshausen and Hans-Rudolf Künsch.
- S. Weisberg (2005). Applied Linear Regression. 3rd edition, Wiley.