Computational Statistics
Spring semester 2019
General information
Lecturer | Marloes Maathuis |
---|---|
Assistants | Jeffrey Näf, Jinzhou Li, Domagoj Ćevid |
Lectures | Thu 13-15 HG F 3; Fri 09-10 HG G 3 |
Exercises | Fri 10-11 HG F 3 |
Question hour | Fri 11-12 HG F 3 |
Credit points | In order to obtain ECTS credit points, you need to pass the official exam. In order to obtain ETH credit points (this only applies to doctoral students from certain departments), you need to solve and submit at least 80% of the exercises. |
Course content
We will study modern statistical methods for data analysis, including their algorithmic aspects and theoretical properties. Special emphasis will be placed on resampling-based methods for inference. The course is hands-on, and methods are applied using the statistical programming language R.
The material for each week will be indicated under the Course material tab (reading material, slides and R-code). See also the Literature tab for the main books that we will use. Exercises and related material can be found under the Exercise tab.
The exam will cover all material that is discussed during the lectures and the exercise sessions. If you miss a class, please make sure to copy class notes from someone else.
All discussions and announcements regarding the course can be found in the Moodle forum.
Course material
Week | Topic |
---|---|
Week 1 | Linear regression |
Week 2 | Linear regression |
Week 3 | Confidence intervals and bias-variance tradeoff |
Week 4 | kNN, cross-validation |
Week 5 | Bootstrap |
Week 6 | Bootstrap |
Week 7 | Bootstrap and simulation tests |
Week 8 | Permutation tests and multiple testing |
Week 9 | Model selection |
Week 10 | Model selection and inference after model selection |
Week 11 | Beyond linearity |
Week 12 | Tree-based methods |
Week 13 | Bagging and random forests |
Exercise classes
The exercises are an essential part of the course. They are also important for exam preparation, since part of the exam requires the use of R. If possible, please bring a laptop to every exercise class. During the exercise hour, the assistants discuss the previous series and introduce the new one, often showing some example code as well. A "Präsenzstunde" (question hour) directly follows the exercise hour; it is meant for working on the exercises yourself and asking questions about anything related to them. For example, you can ask about the solutions to last week's series or get help with the current series. Several assistants will be present to answer your questions.
Unless you are a PhD student who requires ETH credit points, you do not have to hand in your exercises. We will provide solutions, and you are expected to check your own work. Please ask if anything is unclear, for example if you don't understand the solutions, or if you found a different solution.
PhD students who require ETH credit points should email the solved exercises to compstat@stat.math.ethz.ch by 10am on Friday, a week after the pre-discussion.
Exercise sheets
Each new exercise sheet will be uploaded on the Tuesday preceding the corresponding session.
Exercises | Discussion | Deadline | Materials |
---|---|---|---|
R Tutorial | February 22, 2019 | | R-Tutorial, R Code |
Series 1 (updated March 1) | March 1, 2019 | March 8, 2019 | R Code |
Series 2 | March 8, 2019 | March 15, 2019 | R Code |
Series 3 | March 15, 2019 | March 22, 2019 | R Code |
Series 4 | March 22, 2019 | March 29, 2019 | R Code |
Series 5 (updated August 14), R Skeleton | March 29, 2019 | April 5, 2019 | R Code |
Series 6 | April 5, 2019 | April 12, 2019 | R Code |
Series 7, Data | April 12, 2019 | May 3, 2019 | R Code |
Series 8 | May 3, 2019 | May 10, 2019 | R Code |
Series 9 | May 10, 2019 | May 17, 2019 | R Code |
Series 10 | May 17, 2019 | May 24, 2019 | R Code |
Series 11 | May 24, 2019 | May 31, 2019 | R Code |
Solutions
The solutions will be sent weekly to the students enrolled in this course.
Literature
Main literature
- P. Bühlmann, M. Mächler. Script Computational Statistics. (Version of October 12, 2016).
- G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning: with Applications in R [ISLR]. Springer.
- T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning [ESL]. Springer.
Other relevant literature
- C. R. Shalizi. Advanced Data Analysis from an Elementary Point of View.
- B. Efron, T. Hastie. Computer Age Statistical Inference. Cambridge University Press.
- J. E. Gentle. Elements of Computational Statistics. Springer.
- W. N. Venables, B. D. Ripley. Modern Applied Statistics with S. Springer.
- L. Wasserman. All of Statistics. Springer.