Applied Multivariate Statistics

Spring semester 2021

General information

Lecturer Fabio Sigrist
Assistant Domagoj Ćevid
Lectures Online
Exercises Online
Course catalogue data

Important announcement

This website contains only basic info about the course. All materials will be added to the course Moodle. The forums where you can ask questions can be found there.

Course content

Multivariate statistics analyzes data on several random variables simultaneously. This course introduces the basic concepts and provides an overview of classical and modern methods of multivariate statistics, including visualization, dimension reduction, and supervised and unsupervised learning for multivariate data. The emphasis is on applications and on solving problems with the statistical software R.

Literature

  • "An Introduction to Applied Multivariate Analysis with R" (2011) by Everitt and Hothorn
  • "An Introduction to Statistical Learning: With Applications in R" (2013) by James, Witten, Hastie and Tibshirani
  • "Introductory Statistics with R" (2008) by Dalgaard

Additional information

Examples in the lecture, as well as hints and solutions to the exercises, will be based on the statistical software R. This is a freely available open-source program that runs on all major platforms and has become a worldwide standard for data analysis. It can be downloaded from CRAN. An R tutorial can be found here.

Exam

There will be a 120-minute written exam during the regular ETH exam sessions. The exam is "closed book"; a simple pocket calculator with no communication capability is permitted. The exam covers all topics that were discussed or applied in the lectures or the exercises. Students who pass the exam are awarded 5 ECTS credit points.

PhD students who would like to obtain credit points without taking the exam and receiving a grade must sign up with the lecturer at the beginning of the semester and hand in 5 well-solved exercises.

Announcements

  • February 21st, 2021:
    In accordance with the epidemiological measures, this course will be offered exclusively online until further notice. All contents can be found in the course Moodle.
  • February 21st, 2021:
    Beginning of lectures: Monday, 22.02.2021 (Exercises start on 08.03.2021).

Course materials

The schedule is subject to minor modifications. The slides will be uploaded before each lecture.

Week Date Topic
Week 1 22.02.2021 Introduction & visualization
Week 2 01.03.2021 Visualization & outliers
Week 3 08.03.2021 Principal component analysis 1
Week 4 15.03.2021 Principal component analysis 2
Week 5 22.03.2021 Multidimensional scaling
Week 6 29.03.2021 Multidimensional scaling continued / Cluster analysis
Week 7 12.04.2021 Cluster analysis continued
Week 8 26.04.2021 Factor Analysis
Week 9 03.05.2021 Classification 1: discriminant analysis & logistic regression
Week 10 10.05.2021 Extending univariate methods
Week 11 17.05.2021 Classification 2: trees & random forest
Week 12 31.05.2021 Models for repeated measures data

Literature:

  • Everitt and Hothorn (2011), An Introduction to Applied Multivariate Analysis with R, Springer
  • James, Witten, Hastie, and Tibshirani (2013), An Introduction to Statistical Learning, Springer
  • P. Dalgaard (2008), Introductory Statistics with R, Springer

Exercise classes

Exercise classes are held online approximately every second week, starting on 08.03.2021. Please install R and RStudio for the exercises. The first part of each exercise class covers the course content and the exercises. The second part is a Q&A session in which students can ask any questions about the course content.

Series and solutions

There is no testat requirement for students who take the exam. PhD students who do not take the exam but would like to obtain a testat should send their solutions to the assistants by email no later than one week after the discussion in the exercise class.

Exercises Solutions Date of exercise class R code
Series 1 Solutions 1 08.03.2021
Series 2 Solutions 2 22.03.2021
Series 3 Solutions 3 12.04.2021
Series 4 Solutions 4 26.04.2021
Series 5 Solutions 5 10.05.2021
Series 6 Solutions 6 31.05.2021