Multivariate statistics analyzes data on several random variables simultaneously. This course introduces the basic concepts and provides an overview of classical and modern methods of multivariate statistics, including visualization, dimension reduction, and supervised and unsupervised learning for multivariate data. The emphasis is on applications and on solving problems with the statistical software R.
- "An Introduction to Applied Multivariate Analysis with R" (2011) by Everitt and Hothorn
- "An Introduction to Statistical Learning: With Applications in R" (2013) by James, Witten, Hastie and Tibshirani
- "Introductory Statistics with R" (2008) by Dalgaard
Examples in the lecture, as well as hints and solutions to the exercises, will be based on the statistical software R. This is a freely available open-source program that works on all platforms and has become a worldwide standard for data analysis. It can be downloaded from CRAN. An R tutorial can be found here.
There will be a 120-minute written exam during the regular ETH exam sessions. The exam is "closed book"; a simple pocket calculator with no communication capability is permitted. The exam covers all topics that were discussed or applied during the lectures or the exercises. Students who pass the exam are awarded 5 ECTS credit points.
PhD students who would like to obtain credit points without taking the exam and receiving a grade must sign up with the lecturer at the beginning of the semester and hand in 5 well-solved exercises.
February 11th, 2019:
Beginning of lecture: Monday, 18/02/2019 (Exercises start on 04/03/2019).
April 3rd, 2019:
The 4th exercise class, originally scheduled for the 15th of April 2019, will take place on Friday, the 12th of April, from 11am to 1pm in HG D 7.1.
May 23rd, 2019:
The last (6th) exercise class, scheduled for the 27th of May 2019, will take place at 15.00 instead of 8.00 in HG F3.
The schedule is subject to minor modifications. The slides will be uploaded before each lecture.
|Week||Date||Topic|
|Week 1||18.02.2019||Introduction & visualization|
|Week 2||25.02.2019||Visualization & outliers|
|Week 3||04.03.2019||Principal component analysis 1|
|Week 4||11.03.2019||Principal component analysis 2|
|Week 5||18.03.2019||Multidimensional scaling|
|Week 6||25.03.2019||Multidimensional scaling cont. / cluster analysis|
|Week 7||01.04.2019||Cluster analysis cont.|
|Week 8||15.04.2019||Factor analysis|
|Week 9||29.04.2019||Classification 1: discriminant analysis & logistic regression|
|Week 10||06.05.2019||Extending univariate methods|
|Week 11||13.05.2019||Classification 2: trees & random forest|
|Week 12||20.05.2019||Models for repeated measures data|
- Everitt and Hothorn (2011), An Introduction to Applied Multivariate Analysis with R, Springer
- James, Witten, Hastie, and Tibshirani (2013), An Introduction to Statistical Learning, Springer
- P. Dalgaard (2008), Introductory Statistics with R, Springer
Exercise classes are held approx. every second week starting from 04/03/2019 in HG D 1.1. Please install R and RStudio and bring your laptop to the exercise classes, if possible.
Series and solutions
There is no testat requirement for students who take the exam. PhD students who do not take the exam but would like to obtain a testat should send their solutions to the assistants by email no later than one week after the discussion in the exercise class.
|Exercises||Solutions||Date of exercise class|
|Series 1||Solutions 1||04.03.2019|
|Series 2||Solutions 2||18.03.2019|
|Series 3||Solutions 3||01.04.2019|
|Series 4||Solutions 4||12.04.2019|
|Series 5||Solutions 5||06.05.2019|
|Series 6||Solutions 6||27.05.2019 (15.00)|