Computational Statistics
Spring semester 2018
General information
Lecturer | Marloes Maathuis |
---|---|
Assistants | Peter Hinz, Loris Michel, Claude Renaux |
Lectures | Thu 13-15, HG F 3; Fri 09-10, HG G 3 |
Exercises | Fri 10-12, HG F 3 |
Course catalogue data | >> |
Credit points | In order to obtain ECTS credit points, you need to pass the official exam. In order to obtain ETH credit points (this only applies to doctoral students from certain departments), you need to solve and submit at least 70% of the exercises. |
Course content
We will study modern statistical methods for data analysis, including their algorithmic aspects and theoretical properties. Special emphasis will be placed on resampling-based methods for inference. The course is hands-on, and methods are applied using the statistical programming language R.
The main text for the course is the script by Bühlmann and Mächler (see Literature). We will also use material from the books listed there. The material for each week will be indicated under the Course material tab.
The exam will cover all material that is discussed during the lectures and the exercise sessions. If you miss a class, please make sure to copy class notes from someone else.
Announcements
- First week
  Beginning of lecture: Thursday, 22/02/2018. Exercises start on 23/02/2018 at 10:15 in room HG F 3, with a special introduction to the R software. Please bring a laptop. See the Exercises tab and further helpful links related to R.
- Special dates:
  - Different room: The class on March 29th takes place in HG E 3.
  - Holidays: There will be no class on March 30th, April 5th, April 6th and May 10th.
  - Cancellation: The class and exercises on May 11th are cancelled.
Course material
Week | Topic |
---|---|
Week 1 | Linear regression |
Week 2 | Linear regression |
Week 3 | Confidence intervals and bias-variance trade-off |
Week 4 | KNN, cross-validation |
Week 5 | Bootstrap |
Week 6 | Bootstrap |
Week 7 | no class |
Week 8 | Bootstrap and Monte Carlo tests |
Week 9 | Permutation tests |
Week 10 | Multiple testing and model selection |
Week 11 | Inference after model selection, moving beyond linearity |
Week 12 | no class |
Week 13 | Beyond linearity |
Week 14 | Tree-based methods |
Week 15 | Bagging and boosting |
Exercise classes
The exercises form a very important part of the course. They are also important for exam preparation, since part of the exam requires the use of R. If possible, please always bring a laptop to the exercise classes. After a brief introduction by one of the assistants, you can work on the exercises yourself or with others, and ask any questions you may have (about the exercises or the course material). There will be several assistants available to answer your questions.
Unless you are a PhD student who requires ETH credit points, you do not have to hand in your exercises. We will provide solutions, and you are expected to check your own work. Please ask if anything is unclear, for example if you don't understand the solutions, or if you found a different solution.
PhD students who require ETH credit points should place their solved exercises in the corresponding tray in HG J 68 by noon on the due date.
Exercise sheets
Each new exercise sheet will be uploaded on the Tuesday preceding the corresponding session.
Exercises | Discussion | Deadline | Slides / Notes |
---|---|---|---|
| | February 23, 2018 | | R-Tutorial, R Code |
Series 1 | March 2, 2018 | March 9, 2018 | R Code for discussion |
Series 2 (updated March 19) | March 9, 2018 | March 15, 2018 | R Code for discussion |
Series 3 (updated March 26) | March 16, 2018 | March 22, 2018 | R Code for discussion (Shalizi, Appendix N) |
Series 4 (updated March 26) | March 23, 2018 | March 29, 2018 | R Code for discussion |
Series 5, R-skeleton | April 13, 2018 | April 19, 2018 | R Code for discussion |
Series 6 | April 20, 2018 | April 26, 2018 | R Code for discussion |
Series 7 (updated May 18), data | April 27, 2018 | May 3, 2018 | R Code for discussion |
Series 8 | May 4, 2018 | May 17, 2018 | R code (html) for discussion |
Series 9 | May 18, 2018 | May 24, 2018 | Subdifferentials |
Series 10 (revised on 31.05.18) | May 25, 2018 | May 31, 2018 | |
General discussion | June 1, 2018 | | |
Solutions
The solutions will be sent weekly to the students enrolled in this course.
Links
- R homepage
- RStudio homepage
- Getting help with R
- Try R Code School (interactive tutorial; no installation of R is required)
Literature
Main literature
- P. Bühlmann, M. Mächler. Script Computational Statistics. (Version of October 12, 2016).
- G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning: with Applications in R [ISLR]. Springer.
- T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning [ESL]. Springer.
Other relevant literature
- C. R. Shalizi. Advanced Data Analysis from an Elementary Point of View.
- B. Efron, T. Hastie. Computer Age Statistical Inference. Cambridge University Press.
- J. E. Gentle. Elements of Computational Statistics. Springer.
- W. N. Venables, B. D. Ripley. Modern Applied Statistics with S. Springer.
- L. Wasserman. All of Statistics. Springer.