Statistical Modelling
Autumn semester 2020
General information
Lecturer  Peter Bühlmann 

Assistant  Leonard Henckel 
Lectures  Tue 10-12, HG E 5
Thu 14-16, HG E 7
Course catalogue data
Course content
In regression analysis, we examine the relationship between a random response variable and several other explanatory variables. In this class, we consider the theory of linear regression with one or more explanatory variables. Moreover, we also study robust methods, generalized linear models, model choice, high-dimensional linear models, nonlinear models and nonparametric methods. Several numerical examples will illustrate the theory. You will learn to perform a regression analysis and interpret the results correctly. We will use the statistical software R to get hands-on experience with this. You will also learn to interpret and critique regression analyses done by others.
Announcements
Starting on September 24th, the exercise classes will take place every second Thursday. The first exercise session will include an introduction to the statistical programming language R with some exercises. In the exercise sessions, you can work on the R problems and the series, and ask questions. You need to bring your own laptop for solving the R questions. On Tuesdays there will be a lecture every week, and the class on Thursdays will alternate between lectures and exercise sessions (exceptions will be announced). Please check this course website regularly for announcements regarding the schedule. The first lecture will be on September 15th.
Starting from November 5th, the exercise classes will take place via Zoom. Please join using the Zoom details provided on the course's Moodle page.
Course materials
The datasets used in the R scripts shown during the lectures can be found here.
Two old exams are available here.
Week  Topic
Week 1  I  Introduction
Week 1  II  Classical linear model
Week 2  Classical linear model
Week 3  I  Classical linear model
Week 3  II  Classical linear model
Week 4  Hypothesis testing
Week 5  I  Hypothesis testing and confidence intervals
Week 5  II  Confidence intervals and model selection
Week 6  Model selection and the Gauss-Markov theorem
Week 7  I  Model selection and logistic regression
Week 7  II  Logistic regression
Week 8  Generalized linear models
Week 9  I  Nonlinear least squares and hypothesis testing
Week 9  II  Nonparametric regression
Week 10  Nonparametric regression and cross-validation
Week 11  I  Nonparametric regression
Week 11  II  Q&A session
Week 12  I  Lasso
Week 13  I  Q&A session and Lasso
Week 13  II  Lasso and robust regression
Week 14  I  Robust regression
Software
Examples in the lecture as well as solutions to the exercises will be based on the statistical software R. R is a freely available open-source program that works on all platforms and has become a worldwide standard for data analysis. It can be downloaded from CRAN. An R tutorial can be found here. The most commonly used editor for R is RStudio, which can be downloaded from here.
Exercise classes
Exercise classes will take place every other week on Thursdays. The first exercise class on September 24th will feature an R tutorial with some exercises. Please install R and RStudio and bring your laptop to the exercise classes.
Series
If you are a PhD student who needs ETH credit points, the submission of four exercise series is mandatory. If this applies to you, please email your solutions to the assistants or place them in the corresponding tray in HG J 68. Students who need ECTS credit points have to take the exam.
Exercises  Solutions  Due date 

R exercise  R solutions
Series 1  Solutions 1  08.10.2020
Series 2  Solutions 2  22.10.2020
Series 3  Solutions 3  05.11.2020
Series 4  Solutions 4  19.11.2020
Series 5  Solutions 5  03.12.2020
Series 6  Solutions 6  17.12.2020
Materials
Literature
 L. Fahrmeir, T. Kneib, S. Lang and B. Marx (2013), Regression: Models, Methods and Applications. Springer.
 T. Hastie, R. Tibshirani, and J. Friedman (2009), The Elements of Statistical Learning [ESL]. 2nd edition, Springer.
 G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning: with Applications in R [ISLR]. Springer.
 Script by Peter Bühlmann, Nicolai Meinshausen and Hans-Rudolf Künsch.
 Additional notes by Peter Bühlmann on heteroscedastic errors and robust inference.
 S. Weisberg (2005). Applied Linear Regression. 3rd edition, Wiley.