Applied Statistical Regression

Autumn semester 2020

General information

Lecturer Marcel Dettling
Assistant Drago Plecko
Lectures (Online) Mon 08-10
Exercises (Online) Mon 10-12 (bi-weekly)
Course catalogue data
Literature

Faraway (2005): Linear Models with R
Faraway (2006): Extending the Linear Model with R
Draper & Smith (1998): Applied Regression Analysis
Fox (2008): Applied Regression Analysis and GLMs
Montgomery et al. (2006): Introduction to Linear Regression Analysis

Course content

Abstract

This course offers a practically oriented introduction to regression modeling methods. The basic concepts and some mathematical background are included, with the emphasis on learning "good practice" that students can apply in their own projects and daily work. Special focus is placed on the use of the statistical software package R for regression analysis.

Objective

Students acquire advanced practical skills in linear regression analysis and become familiar with its extension to generalized linear models.

Content

The course starts with the basics of linear modeling, and then proceeds to parameter estimation, tests, confidence intervals, residual analysis, model choice, and prediction. Less commonly taught but practically relevant topics that will be covered include variable transformations, multicollinearity problems and model interpretation, as well as general modeling strategies.
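As a rough illustration of the first steps named above (parameter estimation and residual analysis), the following sketch fits a simple linear regression by least squares. The course itself works in R, where the fit would be obtained with lm(y ~ x); the sketch uses plain Python instead, and the helper name fit_simple_regression and the data are made up for illustration only.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squares of x and cross-product of x and y around their means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residuals: observed response minus fitted value
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


# Made-up illustrative data, not taken from the course
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope, residuals = fit_simple_regression(x, y)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print("residuals:", [round(r, 3) for r in residuals])
```

With an intercept in the model, the least-squares residuals sum to zero, which is a quick sanity check before looking at residual plots.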

The last third of the course is dedicated to extensions of the linear model: the generalized additive model and generalized linear models, including logistic regression for binary response variables, binomial regression for grouped data and Poisson regression for count data.
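To give an idea of what distinguishes these models from ordinary linear regression, the sketch below shows how a linear predictor is mapped to the mean of the response. It assumes the usual default link functions (logit for logistic regression, log for Poisson regression), which this page does not spell out; the function names and coefficients are hypothetical and chosen for illustration only.

```python
import math


def linear_predictor(coefs, x):
    """eta = b0 + b1*x1 + ... for a single observation (hypothetical helper)."""
    return coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))


def logistic_mean(eta):
    """Inverse logit link (assumed): maps eta to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def poisson_mean(eta):
    """Inverse log link (assumed): maps eta to a positive expected count."""
    return math.exp(eta)


# Made-up coefficients (intercept, slope) and a single predictor value
coefs = [-1.0, 0.5]
eta = linear_predictor(coefs, [2.0])  # -1.0 + 0.5 * 2.0

print("linear predictor:", eta)
print("P(Y = 1) under logistic regression:", logistic_mean(eta))
print("E[Y] under Poisson regression:", poisson_mean(eta))
```

In both cases the predictors enter linearly, and only the link function keeps the fitted mean inside the range that is valid for the response.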

Notice

Both the exercises and the lectures are based on procedures from the freely available, open-source statistical software package R, to which an introduction will be given.

Announcements

  • September 15, 2020:
    Beginning of lectures and exercise classes: Monday, 21.09.2020.

Course materials

All course materials are available via Moodle (only accessible to students who are enrolled via ETH MyStudies).

Course organisation

The following table contains a tentative outline of the course and is subject to change.

Week Topic
Week 1 (14.09.2020) No lecture
Week 2 (21.09.2020) Linear Modeling and Smoothing
Week 3 (28.09.2020) Simple Linear Regression: Fitting and Inference
Week 4 (05.10.2020) Curvilinear Models, Variable Transformations
Week 5 (12.10.2020) Multiple Linear Regression: Model and Fitting
Week 6 (19.10.2020) Multiple Linear Regression: Inference and Prediction
Week 7 (26.10.2020) Extensions: Categorical Variables, Interactions
Week 8 (02.11.2020) Model Diagnostics: Standard Residual Plots
Week 9 (09.11.2020) Model Diagnostics: Advanced Techniques
Week 10 (16.11.2020) Multicollinearity and Variable Selection
Week 11 (23.11.2020) Modeling Strategies, Cross Validation
Week 12 (30.11.2020) Generalized Additive Modeling (GAM)
Week 13 (07.12.2020) Generalized Linear Modeling (GLM)
Week 14 (14.12.2020) Grouped Data, Poisson Regression

Exercise classes

Exercises are held bi-weekly, from 10:15 to 11:55, online via Zoom. The link will be posted on Moodle, which is only accessible to students who are enrolled via ETH MyStudies.

The first exercise class is an opportunity to ask questions about the software R; you should be familiar with the R tutorial and exercise sheet 1 beforehand. R will also be used in all later exercise classes, so you are expected to have your laptop at hand. Starting with the second exercise class, the first hour (at most) is spent discussing the previous exercise sheet (common problems) and the new exercise sheet (hints and theory as needed). In the second hour, you may ask any questions relating to the course.

The exercise classes take place on the following dates:

  • September 21, 2020
  • October 05, 2020
  • October 19, 2020
  • November 02, 2020
  • November 16, 2020
  • November 30, 2020
  • December 14, 2020

Series and solutions

The series and their solutions will be shared via Moodle. PhD students who want to get credits for the course should hand in their exercises. The solved exercises should be sent to the course assistant before the respective deadline.

Exercises Solutions Due date
Series 1 (Introduction to R) Solution 1 NA
Series 2 (Smoothing) Solution 2 September 28, 2020
Series 3 (Simple Regression) Solution 3 October 12, 2020
Series 4 (Multiple Regression) Solution 4 October 26, 2020
Series 5 (Model Diagnostics) Solution 5 November 09, 2020
Series 6 (Variable Selection) Solution 6 November 23, 2020
Series 7 (Modeling Strategies) Solution 7 December 07, 2020
Series 8 (GLM) Solution 8 NA

Help with R

During the first exercise class you will have the opportunity to ask questions about the software R. Further material can be found via the links below.

R homepage
RStudio homepage
Getting help with R
Online R course (in German)
Try R