Computational Statistics
Spring semester 2018
General information
Lecturer  Marloes Maathuis 

Assistants  Peter Hinz, Loris Michel, Claude Renaux 
Lectures  Thu 13-15 HG F 3
Fri 09-10 HG G 3
Exercises  Fri 10-12 HG F 3
Credit points  In order to obtain ECTS credit points, you need to pass the official exam. In order to obtain ETH credit points (this only applies to doctoral students from certain departments), you need to solve and submit at least 70% of the exercises. 
Course content
We will study modern statistical methods for data analysis, including their algorithmic aspects and theoretical properties. Special emphasis will be placed on resampling-based methods for inference. The course is hands-on, and methods are applied using the statistical programming language R.
The main text for the course is the script by Bühlmann and Mächler (see Literature). We will also use material from the books listed there. The material for each week will be indicated under the Course material tab.
The exam will cover all material that is discussed during the lectures and the exercise sessions. If you miss a class, please make sure to copy class notes from someone else.
Announcements

First week
Lectures begin on Thursday, 22/02/2018. Exercises start on 23/02/2018 at 10:15 in room HG F 3, with a special introduction to the R software. Please bring a laptop. See the Exercises tab and the Links section for further helpful resources on R.

Special dates:
 Different room: The class on March 29th takes place in HG E 3.
 Holidays: There will be no class on March 30th, April 5th, April 6th and May 10th.
 Cancellation: The class and exercises on May 11th are cancelled.
Course material
Week  Topic
Week 1  Linear regression
Week 2  Linear regression
Week 3  Confidence intervals and bias-variance tradeoff
Week 4  KNN, cross-validation
Week 5  Bootstrap
Week 6  Bootstrap
Week 7  no class
Week 8  Bootstrap and Monte Carlo tests
Week 9  Permutation tests
Week 10  Multiple testing and model selection
Week 11  Inference after model selection, moving beyond linearity
Week 12  no class
Week 13  Beyond linearity
Week 14  Tree-based methods
Week 15  Bagging and boosting
Exercise classes
The exercises form a very important part of the course. They are also important for exam preparation, since part of the exam requires the use of R. If possible, please always bring a laptop to the exercise classes. After a brief introduction by one of the assistants, you can work on the exercises yourself or with others, and ask any questions you may have (about the exercises or the course material). There will be several assistants available to answer your questions.
Unless you are a PhD student who requires ETH credit points, you do not have to hand in your exercises. We will provide solutions, and you are expected to check your own work. Please ask if anything is unclear, for example if you don't understand the solutions, or if you found a different solution.
PhD students who require ETH credit points should place their solved exercises in the corresponding tray in HG J68 by noon on the due date.
Exercise sheets
Each new exercise sheet will be uploaded on the Tuesday preceding the corresponding session.
Exercises  Discussion  Deadline  Slides / Notes
February 23, 2018  R-Tutorial, R Code
Series 1  March 2, 2018  March 9, 2018  R Code for discussion 
Series 2 (updated March 19)  March 9, 2018  March 15, 2018  R Code for discussion 
Series 3 (updated March 26)  March 16, 2018  March 22, 2018  R Code for discussion (Shalizi, Appendix N) 
Series 4 (updated March 26)  March 23, 2018  March 29, 2018  R Code for discussion 
Series 5, R-skeleton  April 13, 2018  April 19, 2018  R Code for discussion
Series 6  April 20, 2018  April 26, 2018  R Code for discussion 
Series 7 (updated May 18), data  April 27, 2018  May 3, 2018  R Code for discussion 
Series 8  May 4, 2018  May 17, 2018  R Code (html) for discussion
Series 9  May 18, 2018  May 24, 2018  Subdifferentials 
Series 10 (updated May 31)  May 25, 2018  May 31, 2018
General discussion  June 1, 2018 
Solutions
Solutions will be sent weekly to the students enrolled in this course.
Links
 R homepage
 RStudio homepage
 Getting help with R
 Try R Code School (interactive tutorial; no installation of R is required)
Literature
Main literature
 P. Bühlmann, M. Mächler. Script Computational Statistics. (Version of October 12, 2016).
 G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning: with Applications in R [ISLR]. Springer.
 T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning [ESL]. Springer.
Other relevant literature
 C. R. Shalizi. Advanced Data Analysis from an Elementary Point of View.
 B. Efron, T. Hastie. Computer Age Statistical Inference. Cambridge University Press.
 J. E. Gentle. Elements of Computational Statistics. Springer.
 W. N. Venables, B. D. Ripley. Modern Applied Statistics with S. Springer.
 L. Wasserman. All of Statistics. Springer.