Statistical Modelling
Autumn semester 2022
General information
Lecturer  Peter Bühlmann 

Assistant  Malte Londschien 
Lectures  Mon 10-12 ML D 28 
Thu 14-16 HG E 1.1  
Course catalogue data  >> 
Course content
In regression analysis, we examine the relationship between a random response variable and several other explanatory variables. In this class, we consider the theory of linear regression with one or more explanatory variables. Moreover, we also study robust methods, generalized linear models, model choice, high-dimensional linear models, nonlinear models and nonparametric methods. Several numerical examples will illustrate the theory. You will learn to perform a regression analysis and interpret the results correctly. We will use the statistical software R to get hands-on experience with this.
Announcements

September 13th 2022

First lecture on September 22nd
The first lecture is on Thursday, September 22nd, starting at 14:15 in HG E 1.1. 
Frequency of lectures and exercises
There will be a lecture every Monday at 10:15 in ML D 28. Lectures and exercises alternate on Thursdays at 14:15 in HG E 1.1. Exercises will take place every second week, starting on September 29th. Exceptions will be announced during the lecture and on this website. 
Content of the first exercise session
The first exercise session will include an introduction to the statistical programming language R with some exercises. In the exercise sessions, you can work on the R problems and the series and ask questions. You need to bring your own laptop for solving the R questions. 
Recording of lectures
The lectures will not be streamed via Zoom, but they will be recorded and uploaded to the ETH video portal after each lecture.


October 8th 2022

No class and no exercise class on October 10th and October 13th.
Please read chapters 1.6.1, 1.6.3, and 1.6.4 of the script and solve exercise series 2. If you have any questions regarding the exercises, use the Moodle Overflow forum or send an email to Malte Londschien. The reading material will be discussed in the next lecture on October 17th.


November 8th 2022

Switch of lecture and exercise class on Thursday, November 10th and Monday, November 14th.
There will be a lecture next Thursday, November 10th. The exercise class will take place on Monday, November 14th. The rooms remain the same.


December 5th 2022

Two more exams, both from 2021, have been made available.
See here.


December 9th 2022

Course schedule for the end of the year
There will be an exercise class on Monday, 12th of December. There will be no lecture or exercise class on Thursday, 15th of December. There will be a last exercise class on Thursday, 22nd of December.
January 16th 2023 

Solutions for exam sample questions
We have uploaded the solutions for the exam sample questions.
Course materials
 Lecture recordings: See the ETH video portal.
 Moodle: Please use the Q&A Moodle Overflow Forum to ask questions.
 The datasets used in the R scripts shown during the lectures can be found here (old).
 Four old exams, from 2018/2019 and 2021, are available. Note: Some of the covered material may differ.
 We provide sample questions and their solutions for the exam.
 Scans of the visualizer will be uploaded here.
 Slides will be uploaded here.
Week  Topic 

Week 1  Introduction 
Week 2  Classical linear model 
Week 3  Classical linear model 
Week 4  Classical linear model

Week 5  Classical linear model 
Week 6  Classical linear model 
Week 7  Classical linear model 
Week 8  Classical linear model 
Week 9  Classical linear model 
Week 10  Generalized linear models 
Week 11  Generalized linear models 
Week 12  Nonparametric models 
Week 13  High-dimensional models 
Software
Examples in the lecture as well as solutions to the exercises will be based on the statistical software R. R is a freely available open-source program that works on all platforms and has become a worldwide standard for data analysis. It can be downloaded from CRAN. An R Tutorial can be found here. The most commonly used editor for R is RStudio, which can be downloaded from here.
Exercise classes
Exercise classes will take place every other week on Thursdays. The first exercise class on September 29th will feature an R tutorial with some exercises. Please install R and RStudio and bring your laptop to the exercise classes.
Series
If you are a PhD student who needs ETH credit points, the submission of four exercise series is mandatory. If this applies to you, please contact the assistant.
Exercises  Solutions 

R Series  R Solutions 
Series 1  Solutions 1 
Series 2  Solutions 2 
Series 3  Solutions 3 
Series 4  Solutions 4 
Series 5  Solutions 5 
Series 6  Solutions 6 
Materials
Week  Materials 

Week 2  R-Introduction 
Week 8  
Week 9 
Literature
 L. Fahrmeir, T. Kneib, S. Lang and B. Marx (2013), Regression: Models, Methods and Applications. Springer.
 T. Hastie, R. Tibshirani, and J. Friedman (2009), The Elements of Statistical Learning [ESL]. 2nd edition, Springer.
 G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning: with Applications in R [ISLR]. Springer.
 Script (to be updated) by Peter Bühlmann, Nicolai Meinshausen and Hans-Rudolf Künsch.
 Additional notes by Peter Bühlmann on heteroscedastic errors and robust inference.
 S. Weisberg (2005). Applied Linear Regression. 3rd edition, Wiley.