Statistical Modelling
Autumn semester 2021
General information
Lecturer  Christina Heinze-Deml

Assistant  Juan Gamella 
Lectures  Mon 10-12 ML D 28
Thu 14-16 HG E 1.1
Course catalogue data  >> 
Course content
In regression analysis, we examine the relationship between a random response variable and several other explanatory variables. In this class, we consider the theory of linear regression with one or more explanatory variables. Moreover, we also study robust methods, generalized linear models, model choice, high-dimensional linear models, nonlinear models and nonparametric methods. Several numerical examples will illustrate the theory. You will learn to perform a regression analysis and interpret the results correctly. We will use the statistical software R to get hands-on experience with this.
Announcements
 September 17th 2021:
 The lectures will be held in hybrid mode: they take place in person and are also livestreamed and recorded via Zoom. The Zoom details can be found on Moodle.
 Starting on September 30th, the exercise classes will take place every second Thursday. The first exercise session will include an introduction to the statistical programming language R with some exercises. In the exercise sessions, you can work on the R problems and the series, and ask questions. There will be a lecture every Monday, while the Thursday class will alternate between lectures and exercise sessions (exceptions will be announced).
 We will be using the ETH EduApp during the lectures for clicker questions. Please install it on your phone or tablet (iOS, Android). If this is not possible, you can also access the EduApp via the Web App.
Course materials
 Zoom link and lecture recordings: See Moodle.
 The handwritten notes made during the lectures can be found here. If you have trouble accessing these, please let us know.
 The datasets used in the R scripts shown during the lectures can be found here.
 Two old exams are made available here. Note: The course was then taught by a different lecturer and some of the covered material differs.
Week  Topic
Week 1  Introduction
Week 2  Classical linear model
Week 2 - exercise  Introduction to R
Week 3 - I  Classical linear model
Week 3 - II  Classical linear model
Week 4 - I  Classical linear model
Week 4 - exercise  Exercise
Week 5 - I  Hypothesis testing
Week 5 - II  Hypothesis testing and confidence intervals
Week 6  Multiple testing I
Week 6 - exercise  Exercise
Week 7 - I  Multiple testing II
Week 7 - II  Prediction intervals and model selection
Week 8 - I  Model selection
Week 8 - exercise  Tutorial - Linear algebra review: Positive-definite matrices
Week 9 - I  Model diagnostics
Week 9 - II  Model diagnostics
Week 10 - I  General linear model, weighted least squares and robust regression
Week 10 - II  Tutorial - clicker questions on multiple testing
Week 11 - I  Cancelled - no class
Week 11 - II  Robust regression and generalized linear models
Week 12 - I  Generalized linear models
Week 12 - II  Tutorial: ROC curves
Week 13 - I  Generalized linear models and penalized regression
Week 13 - II  Penalized regression
Week 14 - I  Knockoffs
Week 14 - II  Tutorial: Subdifferentials, exponential families
Software
Examples in the lecture as well as solutions to the exercises will be based on the statistical software R. R is a freely available open-source program that works on all platforms and has become a worldwide standard for data analysis. It can be downloaded from CRAN. An R Tutorial can be found here. The most commonly used editor for R is RStudio, which can be downloaded from here.
Exercise classes
Exercise classes will take place every other week on Thursdays. The first exercise class on September 30th will feature an R tutorial with some exercises. Please install R and RStudio and bring your laptop to the exercise classes.
If you would like the TA to go over a particular topic or exercise during the tutorial, please write an email or post your question here in advance.
Would you like something done differently? You can leave anonymous feedback here.
Series
Exercises  Solutions  Due date
R Series  R Solutions  None
Series 1  Solutions 1  October 14th 2021 
Series 2  Solutions 2  October 28th 2021 
Series 3  Solutions 3 (updated Nov 17)  14:00, November 9th 2021 
Series 4 (updated Nov 12)  Solutions 4  14:00, November 23rd 2021 
Series 5  Solutions 5  14:00, November 30th 2021 
Series 6  Solutions 6  14:00, December 21st 2021 
Series 7  Solutions 7  None 
Literature
 L. Fahrmeir, T. Kneib, S. Lang and B. Marx (2013). Regression: Models, Methods and Applications. Springer.
 T. Hastie, R. Tibshirani and J. Friedman (2009). The Elements of Statistical Learning [ESL]. 2nd edition, Springer.
 G. James, D. Witten, T. Hastie and R. Tibshirani (2021). An Introduction to Statistical Learning: with Applications in R [ISLR]. 2nd edition, Springer.
 Script by Peter Bühlmann, Nicolai Meinshausen and Hans-Rudolf Künsch.
 S. Weisberg (2005). Applied Linear Regression. 3rd edition, Wiley.