Computational Statistics

Spring semester 2019

General information

Lecturer: Marloes Maathuis
Assistants: Jeffrey Näf, Jinzhou Li, Domagoj Ćevid
Lectures: Thu 13-15, HG F 3; Fri 09-10, HG G 3
Exercises: Fri 10-11, HG F 3
Question hour: Fri 11-12, HG F 3
Credit points: To obtain ECTS credit points, you need to pass the official exam. To obtain ETH credit points (this applies only to doctoral students from certain departments), you need to solve and submit at least 80% of the exercises.

Course content

We will study modern statistical methods for data analysis, including their algorithmic aspects and theoretical properties. Special emphasis is placed on resampling-based methods for inference. The course is hands-on: methods are applied using the statistical programming language R.

The material for each week (reading material, slides, and R code) will be listed under the Course material tab. See also the Literature tab for the main books that we will use. Exercises and related material can be found under the Exercise tab.

The exam covers all material discussed during the lectures and the exercise sessions. If you miss a class, please make sure to get the class notes from a fellow student.

All discussions and announcements regarding the course can be found in the Moodle forum.

Course material

Week Topic
Week 1 Linear regression
Week 2 Linear regression
Week 3 Confidence intervals and bias-variance tradeoff
Week 4 kNN and cross-validation
Week 5 Bootstrap
Week 6 Bootstrap
Week 7 Bootstrap and simulation tests
Week 8 Permutation tests and multiple testing
Week 9 Model selection
Week 10 Model selection and inference after model selection
Week 11 Beyond linearity
Week 12 Tree-based methods
Week 13 Bagging and random forests

Exercise classes

The exercises form a very important part of the course. They are also important for exam preparation, since part of the exam requires the use of R. If possible, please bring a laptop to every exercise class. During the exercise hour, the assistants discuss the previous series and introduce the new one, often showing some example code as well. The exercise hour is directly followed by a question hour ("Präsenzstunde"), during which you can work on the exercises yourself and ask questions about anything related to them. For example, you can ask about the solutions to last week's series or get help with the current series. We will make sure that several assistants are present to answer your questions.

Unless you are a PhD student who requires ETH credit points, you do not have to hand in your exercises. We will provide solutions, and you are expected to check your own work. Please ask if anything is unclear, for example if you don't understand the solutions, or if you found a different solution.

PhD students who require ETH credit points should email their solved exercises to compstat@stat.math.ethz.ch by 10:00 on Friday, one week after the series is introduced in the exercise class.

Exercise sheets

Each new exercise sheet is uploaded on the Tuesday preceding the corresponding session.

Exercises                                 Discussion         Deadline           Materials
R Tutorial                                February 22, 2019                     R-Tutorial, R Code
Series 1 (updated March 1)                March 1, 2019      March 8, 2019      R Code
Series 2                                  March 8, 2019      March 15, 2019     R Code
Series 3                                  March 15, 2019     March 22, 2019     R Code
Series 4                                  March 22, 2019     March 29, 2019     R Code
Series 5 (updated August 14), R Skeleton  March 29, 2019     April 5, 2019      R Code
Series 6                                  April 5, 2019      April 12, 2019     R Code
Series 7, Data                            April 12, 2019     May 3, 2019        R Code
Series 8                                  May 3, 2019        May 10, 2019       R Code
Series 9                                  May 10, 2019       May 17, 2019       R Code
Series 10                                 May 17, 2019       May 24, 2019       R Code
Series 11                                 May 24, 2019       May 31, 2019       R Code

Solutions

The solutions will be sent weekly to the students enrolled in this course.

Literature

Main literature

Other relevant literature