Statistical Modelling

Autumn semester 2020

General information

Lecturer: Peter Bühlmann
Assistant: Leonard Henckel
Lectures: Tue 10-12, HG E 5
          Thu 14-16, HG E 7

Course content

In regression analysis, we examine the relationship between a random response variable and several other explanatory variables. In this class, we consider the theory of linear regression with one or more explanatory variables. We also study robust methods, generalized linear models, model choice, high-dimensional linear models, nonlinear models and nonparametric methods. Several numerical examples will illustrate the theory. You will learn to perform a regression analysis and interpret the results correctly. We will use the statistical software R to gain hands-on experience. You will also learn to interpret and critique regression analyses done by others.

Announcements

  • August 23rd 2020:
    Starting on September 24th, the exercise classes will take place every second Thursday. The first exercise session will include an introduction to the statistical programming language R with some exercises. In the exercise sessions, you can work on the R problems and the series, and ask questions. You need to bring your own laptop to solve the R problems. There will be a lecture every Tuesday, and the Thursday class will alternate between lectures and exercise sessions (exceptions will be announced). Please check this course website regularly for announcements regarding the schedule. The first lecture will be on September 15th.
  • November 4th 2020:
    Starting on November 5th, the exercise classes will take place via Zoom. Please join using the Zoom details provided on the course's Moodle page.

    Course materials

    The datasets used in the R scripts shown during the lectures can be found here.

    Two old exams are available here.

    Week Topic
    Week 1 - I Introduction
    Week 1 - II Classical linear model
    Week 2 Classical linear model
    • Script chapter 1.3
    • Notes: 1
    Week 3 - I Classical linear model
    • Script chapters 1.3.4 to 1.4.3
    • Notes: 1
    Week 3 - II Classical linear model
    • Script chapters 1.4.3 to 1.5.1
    • Notes: 1
    Week 4 Hypothesis testing
    • Script chapters 1.5.2 to 1.6.2
    Week 5 - I Hypothesis testing and confidence intervals
    • Script chapters 1.6.2 to 1.7.5
    • Notes: 1
    Week 5 - II Confidence intervals and model selection
    • Script chapters 1.8 to 1.9
    • Notes: 1
    Week 6 Model selection and the Gauss-Markov theorem
    • Script chapters 1.8 to 1.9
    • Notes: 1
    Week 7 - I Model selection and logistic regression
    • Script chapters 1.8 and 2.3.1
    • Notes: 1
    Week 7 - II Logistic regression
    • Script chapter 2.3.1
    • Notes: 1
    Week 8 Generalized linear models
    • Script chapter 2.3
    • Notes: 1
    Week 9 - I Nonlinear least squares and hypothesis testing
    • Script chapter 2.2
    • Notes: 1
    Week 9 - II Non-parametric regression
    • Script chapter 2.5.1
    • Notes: 1
    Week 10 Non-parametric regression and cross-validation
    • Script chapter 2.5.2
    • Notes: 1
    Week 11 - I Non-parametric regression
    • Script chapter 2.5
    • Slides: 1
    • Notes: 1
    Week 11 - II Q&A session
    Week 12 - I Lasso
    • Notes: 1
    Week 13 - I Q&A session and Lasso
    Week 13 - II Lasso and robust regression
    Week 14 - I Robust regression

    Software

    Examples in the lecture as well as solutions to the exercises will be based on the statistical software R. R is a freely available, open-source program that runs on all platforms and has become a worldwide standard for data analysis. It can be downloaded from CRAN. An R tutorial can be found here. The most commonly used editor for R is RStudio, which can be downloaded here.

    Exercise classes

    Exercise classes will take place every other week on Thursdays. The first exercise class on September 24th will feature an R tutorial with some exercises. Please install R and RStudio and bring your laptop to the exercise classes.

    Series

    If you are a PhD student who needs ETH credit points, the submission of four exercise series is mandatory. If this applies to you, please email your solutions to the assistants or place them in the corresponding tray in HG J 68. Students who need ECTS credit points have to take the exam.

    Exercises Solutions Due date
    R-exercise R-solutions
    Series 1 Solutions 1 08.10.2020
    Series 2 Solutions 2 22.10.2020
    Series 3 Solutions 3 05.11.2020
    Series 4 Solutions 4 19.11.2020
    Series 5 Solutions 5 03.12.2020
    Series 6 Solutions 6 17.12.2020

    Materials

    Week Materials
    Week 2 R-Introduction
    Week 4 Hypothesis testing
    • R code: 1
    Week 6 Model selection
    • R code: 1 2
    Week 8 Hypothesis testing and GLMs
    • R code: 1 2
    Week 10 Residual analysis and outliers
    Week 12 Nonparametric regression and cross-validation
    • R code: 1 2
    Week 12 Model selection and instrumental variable estimators
    • R code: 1

    Literature