Regression
Spring semester 2018
General information
Course content
In regression analysis, we examine the relationship between a random response variable and several explanatory variables. In this class, we cover the theory of linear regression with one or more explanatory variables. We also study robust methods, generalized linear models, model choice, high-dimensional linear models, nonlinear models and nonparametric methods. Several numerical examples will illustrate the theory. You will learn to perform a regression analysis and interpret the results correctly, using the statistical software R to gain hands-on experience. You will also learn to interpret and critique regression analyses done by others.
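As a rough illustration of what performing and interpreting a regression analysis in R can look like, here is a minimal sketch. It is not taken from the course materials; the choice of the `cars` dataset (which ships with base R) and of a simple one-predictor model are assumptions made purely for illustration.

```r
# Illustrative sketch only; not part of the official course material.
# `cars` ships with base R: 50 observations of speed and stopping distance.
data(cars)

# Fit a simple linear regression of stopping distance on speed.
fit <- lm(dist ~ speed, data = cars)

# Coefficient estimates, standard errors, t-tests and R-squared.
print(summary(fit))

# 95% confidence intervals for the intercept and the slope.
print(confint(fit, level = 0.95))

# Predicted stopping distance (with a prediction interval) at speed 15.
new_obs <- data.frame(speed = 15)
print(predict(fit, newdata = new_obs, interval = "prediction"))
```

The slope estimate is read as the expected change in stopping distance per unit increase in speed; whether the linear model is adequate would still have to be checked, for example with residual plots.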
Literature
 Practical Regression and Anova using R by Julian J. Faraway (2002), with R code.
 Peter Bühlmann and Sara van de Geer (2011), "Statistics for High-Dimensional Data: Methods, Theory and Applications", Springer. (Available here for free when logged in via ETH; for high-dimensional regression.)
 John Fox (1997), "Applied Regression Analysis, Linear Models, and Related Methods", Sage Publications. (Intuitive examples, not very mathematical.)
 Sanford Weisberg (2005), "Applied Linear Regression", 3rd edition, Wiley. (Similar to the one by Fox, but shorter.)
 Paul D. Allison (1999), "Multiple linear regression, a primer", Thousand Oaks. (Brief, good for interpretations, not very mathematical.)
 Peter Dalgaard (2002), "Introductory Statistics with R", Springer. (Introduction based on the software R.)
 T. Hastie, R. Tibshirani, and J. Friedman (2009), "The Elements of Statistical Learning", 2nd edition, Springer.
Additional information
Examples in the lecture, as well as solutions to the exercises, will be based on the statistical software R. This is a freely available open-source program that works on all platforms and has become a worldwide standard for data analysis. It can be downloaded from CRAN. An R tutorial can be found here.
Announcements

February 10th, 2018:
The first exercise session is on February 21 and will be an introduction to the statistical programming language R with some exercises. Starting with the second exercise session (March 2), the exercise classes will take place every second Friday. In the exercise sessions, you can work on the R problems and the series and ask questions. You need to bring your own laptop to solve the R problems. There are lectures every Wednesday, and Fridays alternate between lectures and exercise sessions (exceptions will be announced). Please check this course website regularly for announcements regarding the schedule. The first lecture will be on February 23.
June 5th, 2018:
The course material for the lecture has been updated to include some R scripts and further notes. Similarly, all sample solutions for the exercises are online.
June 5th, 2018:
Note the following important dates, as announced on the last exercise sheet.
Question hour / Ferienpräsenz:
Monday, August 20th, 2018, 15:00-16:00, HG G 19.2
Thursday, August 23rd, 2018, 15:00-16:00, HG G 19.2
Exam review / Prüfungseinsicht:
Monday, September 24th, 2018, 12:00-13:00, HG G 19.1
Course materials
Text:
 Lecture notes can be found here (PDF).
 The book used for high-dimensional regression is available here for free when logged in via ETH. Details: Peter Bühlmann and Sara van de Geer (2011), "Statistics for High-Dimensional Data: Methods, Theory and Applications", Springer.
 Practical Regression and Anova using R by Julian J. Faraway (2002), with R code.
R scripts, outputs, and slides:
 High-dimensional inference
 Robust methods (by Jonathan Taylor)
 R script (p. 23-28)
 boston.R
 brainsize.R
 kernelsmoothing.R
 leukemia_modelselection.R
 poissonregr.R
 riboflavinhighdim.R
Additional material:
Alternative texts:
 John Fox (1997), "Applied Regression Analysis, Linear Models, and Related Methods", Sage Publications. (Intuitive examples, not very mathematical.)
 Sanford Weisberg (2005), "Applied Linear Regression", 3rd edition, Wiley. (Similar to the one by Fox, but shorter.)
 Paul D. Allison (1999), "Multiple linear regression, a primer", Thousand Oaks. (Brief, good for interpretations, not very mathematical.)
 Peter Dalgaard (2002), "Introductory Statistics with R", Springer. (Introduction based on the software R.)
 T. Hastie, R. Tibshirani, and J. Friedman (2009), "The Elements of Statistical Learning", 2nd edition, Springer.
Exercise classes
The first exercise class (Wednesday, February 21) will feature an R tutorial with some exercises. Please install R and RStudio and bring your laptop to the exercise classes. From the second exercise class on (March 2), exercise classes will take place every second Friday.
Series and solutions
Handing in solutions for the exercise series is not mandatory. If you do wish to hand in solutions, please do so by 13:00 on the designated optional hand-in date. You can submit your solutions by placing them in the REGRESSION box in room HG J 68.
Exercises             | Hand out          | Optional hand-in | Discussion        | Solution          | Slides/Notes/Remarks
Series 6              | May 18, 2018      | May 30, 2018     | May 25, 2018      | Solutions 6       |
Series 5              | May 4, 2018       | May 16, 2018     | May 11, 2018      | Solutions 5       |
Series 4              | April 20, 2018    | May 2, 2018      | April 27, 2018    | Solutions 4       |
Series 3              | April 6, 2018     | April 18, 2018   | April 13, 2018    | Solutions 3       |
Series 2              | March 9, 2018     | March 21, 2018   | March 16, 2018    | Solutions 2       | Remarks 2, Examining data notes, Examining data R code, Transformations notes, Transformations R code
Series 1; wdi dataset | February 23, 2018 | March 7, 2018    | March 2, 2018     | Solutions 1       | Remarks 1
R Series              |                   |                  | February 21, 2018 | R Series Solution | R Intro Slides, R Intro Demo code, easy dataset, A short introduction to R

Introduction 