Lectures

Previous semesters

The websites of courses taught in previous semesters can be found here.

Question hours

In German: Ferienpräsenz

Important: This semester, the question hours are held on Zoom. Please see the corresponding emails you have received or will receive.

Exam review

In German: Prüfungseinsicht

Important: Exam reviews are regulated by the official ETH Directive on "Viewing and transfer of performance assessment records", which can be found here. The English translation is for information purposes only. The German version is the legally binding one and can be found here.

Statistik und Wahrscheinlichkeitsrechnung

Mathematik IV: Statistik

Fundamentals of Mathematical Statistics

Applied ANOVA and Experimental Design

Bachelor, master and semester thesis topics

Below you can find topics for bachelor, master or semester theses that the supervisors at the Seminar for Statistics offer.
Please note: This site is still under construction.

Peter Bühlmann

Contact: E-mail

Conformal prediction for anchor regression

Description: Conformal prediction yields prediction intervals with valid finite-sample coverage when the data are i.i.d. The goal is to study these methods and extend them to heterogeneous settings by using anchor regression and related techniques for domain adaptation.
Methods: Linear models, machine learning algorithms, stabilization
Knowledge: Statistical methods and modeling, programming in R or Python
Data: Mostly simulated; larger ICU patient data if of interest
Literature:
https://www.tandfonline.com/doi/full/10.1080/01621459.2017.1307116?casa_token=xEbi9SO9uJ0AAAAA%3APhTQ-jYyhH9Ow_wI1DWsepy_PiwKtZ92TFy_tHDdZOIothxphTE_EPsJPXILkcj5YYZbkajD87ytB9M
https://projecteuclid.org/journals/annals-of-statistics/volume-49/issue-1/Predictive-inference-with-the-jackknife/10.1214/20-AOS1965.full
https://proceedings.neurips.cc/paper/2019/hash/8fb21ee7a2207526da55a679f0332de2-Abstract.html
https://www.pnas.org/doi/abs/10.1073/pnas.2107794118
https://rss.onlinelibrary.wiley.com/doi/10.1111/rssb.12398
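
Illustration: the i.i.d. baseline mentioned in the description can be sketched with split conformal prediction. This is a minimal sketch under stated assumptions, not the method to be developed in the thesis: the simulated linear model, the helper names (fit_line, split_conformal, simulate) and the use of absolute residuals as conformity scores are all chosen for illustration, and anchor regression is not part of it.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single covariate."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def split_conformal(x_train, y_train, x_cal, y_cal, alpha=0.1):
    """Fit on the training half, calibrate the interval width on the other half."""
    a, b = fit_line(x_train, y_train)
    # Conformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - (a + b * x)) for x, y in zip(x_cal, y_cal))
    n = len(scores)
    # Finite-sample corrected rank of the calibration quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    q = scores[k - 1] if k <= n else math.inf

    def interval(x_new):
        pred = a + b * x_new
        return pred - q, pred + q

    return interval

def simulate(n, rng):
    """Assumed data-generating model: y = 1 + 2x + standard normal noise."""
    xs = [rng.uniform(-2, 2) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

rng = random.Random(0)
x_tr, y_tr = simulate(200, rng)
x_cal, y_cal = simulate(200, rng)
x_te, y_te = simulate(2000, rng)

interval = split_conformal(x_tr, y_tr, x_cal, y_cal, alpha=0.1)

covered = 0
for x, y in zip(x_te, y_te):
    lo, hi = interval(x)
    covered += lo <= y <= hi
coverage = covered / len(y_te)
print(f"empirical coverage: {coverage:.3f}")
```

With i.i.d. data the empirical coverage should be close to the nominal level of 90%. The guarantee relies on exchangeability of calibration and test data, which is exactly what fails in heterogeneous problems.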

Markus Kalisch

Contact: E-mail

Discrete Choice Models

Description: Discrete choice models, also called qualitative choice models, are intended to explain choices between two or more discrete alternatives, such as whether or not to buy a car, or which occupation to choose. In this project, you will read publications in the area, write a summary, apply and implement methods in R, and perform simulation studies.
Methods: Extensions to linear regression motivated by economics and social sciences
Knowledge: Linear Regression
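
Illustration: one widely used discrete choice model is the multinomial logit, in which the probability of choosing an alternative is proportional to the exponential of its utility. The sketch below is in Python for brevity, although the project itself uses R. The alternatives, their attributes and the coefficients in beta are hypothetical values chosen for illustration.

```python
import math

def choice_probabilities(utilities):
    """Multinomial logit: P(choose j) = exp(V_j) / sum_k exp(V_k)."""
    shift = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: three commuting alternatives with (cost, travel time).
alternatives = {"car": (6.0, 20.0), "bus": (3.0, 35.0), "bike": (0.5, 45.0)}
beta = (-0.30, -0.05)  # assumed taste coefficients for cost and travel time

# Linear utility V_j = beta' x_j for each alternative.
utilities = [sum(b * x for b, x in zip(beta, feats)) for feats in alternatives.values()]
probs = dict(zip(alternatives, choice_probabilities(utilities)))

for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

With only two alternatives, the same formula reduces to the logistic regression model.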

Ordinal Response Models

Description: In many applied settings, the response variable is ordinal, i.e. a variable whose values lie on an arbitrary scale where only the relative ordering between different values is meaningful. In this project, you will read publications in the area, write a summary, apply and implement methods in R, and perform simulation studies.
Methods: Extensions to linear regression motivated by e.g. social sciences
Knowledge: Linear Regression
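
Illustration: a common ordinal response model is the proportional odds (cumulative logit) model, which uses only the ordering of the categories. The sketch below is in Python for brevity, although the project itself uses R. The thresholds and the values of the linear predictor eta are hypothetical.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(eta, thresholds):
    """Proportional odds model: P(Y <= k | x) = logistic(theta_k - eta), eta = x'beta."""
    # Cumulative probabilities for all categories; the last one is always 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    # Category probabilities are differences of consecutive cumulative probabilities.
    probs = [cumulative[0]]
    probs += [cumulative[k] - cumulative[k - 1] for k in range(1, len(cumulative))]
    return probs

thresholds = [-1.0, 0.5, 2.0]  # assumed increasing thresholds, giving 4 ordered categories
low = category_probabilities(eta=-1.0, thresholds=thresholds)
high = category_probabilities(eta=2.0, thresholds=thresholds)

print("eta = -1:", [round(p, 3) for p in low])
print("eta =  2:", [round(p, 3) for p in high])
```

A larger linear predictor shifts probability mass towards the higher categories, while the thresholds stay the same for all observations.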

Generalized Additive Models

Description: A generalized additive model (GAM) is a generalized linear model in which the linear predictor depends linearly on unknown smooth functions of some predictor variables. In this project, you will read publications in the area, write a summary, apply and implement methods in R, and perform simulation studies.
Methods: Extensions to linear regression motivated by many applied fields of research
Knowledge: Linear Regression
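
In symbols, a GAM for a response Y and predictors x_1, ..., x_p can be written as follows. The notation is chosen here for illustration: g is the link function, \beta_0 an intercept, and the f_j are the unknown smooth functions to be estimated.

```latex
g\bigl(\mathbb{E}[Y \mid x_1, \dots, x_p]\bigr) = \beta_0 + \sum_{j=1}^{p} f_j(x_j)
```

Choosing every f_j to be linear, f_j(x_j) = \beta_j x_j, recovers the ordinary generalized linear model.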

Model-Robustness in Linear Regression

Description: Linear regression is a simple but surprisingly powerful tool in practical data analysis problems. In this thesis (SA/BA or MA), we take a closer look at the assumptions and optimality guarantees that come with standard linear regression. We then examine how robust the inference is if these assumptions are violated, and investigate methods that are more robust with respect to such violations.
Methods: Extensions to linear regression motivated by many applied fields of research
Knowledge: Linear Regression
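
Illustration: one well-known example of inference that is more robust to violated assumptions is the heteroskedasticity-consistent ("sandwich", HC0) standard error. The sketch below compares it with the classical standard error for the slope in a simple linear regression. It is only one possible instance of the topic, and the simulated model, in which the noise standard deviation grows with |x|, is an assumption made for illustration.

```python
import math
import random

def slope_standard_errors(xs, ys):
    """Return the OLS slope with its classical and HC0 (sandwich) standard errors."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    dx = [x - mx for x in xs]
    sxx = sum(d * d for d in dx)
    b = sum(d * (y - my) for d, y in zip(dx, ys)) / sxx
    a = my - b * mx
    resid = [y - a - b * x for x, y in zip(xs, ys)]
    # Classical formula: assumes a constant error variance.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    classical = math.sqrt(sigma2 / sxx)
    # HC0 sandwich formula: allows the error variance to depend on x.
    robust = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return b, classical, robust

rng = random.Random(1)
xs = [rng.uniform(-2, 2) for _ in range(2000)]
# Assumed heteroskedastic model: the noise standard deviation equals |x|.
ys = [1.0 + 2.0 * x + abs(x) * rng.gauss(0, 1) for x in xs]

b, classical, robust = slope_standard_errors(xs, ys)
print(f"slope {b:.3f}, classical SE {classical:.4f}, robust SE {robust:.4f}")
```

In this setting the classical standard error understates the uncertainty of the slope, because the observations with the largest leverage also have the largest noise.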

Lukas Meier

Contact: E-mail

Regression with Interval Censoring

Description: Read publications in the area, write a summary, apply and implement methods in R, and perform simulation studies.
Methods: Special regression models motivated by survival analysis
Knowledge: Linear regression

Dyadic Regression Models

Description: Dyadic regression is used to model pairwise interaction data (between people, countries, etc.). Some of these models are also known as "gravity models". Read publications in the area, write a summary, apply and implement methods in R, and perform simulation studies.
Methods: Regression
Knowledge: Linear regression

Nicolai Meinshausen

Contact: E-mail

Fairness in Machine Learning

Description: Read a few key publications in the area of fairness in machine learning and write a concise summary, highlighting key conceptual commonalities and differences.
Methods: Linear regression and classification; tree ensembles; structural causal models
Knowledge: Regression and classification; causality
Data: Standard benchmark datasets can be used, but the project can also be more theoretical

Invariant Risk Minimization

Description: Implement the invariant risk minimization framework of Arjovsky et al. (2019) and write a discussion.
Methods: Linear models; tree ensembles; deep networks; causal inference
Knowledge: Machine Learning; Causality
Data: Datasets from the paper or simple simulated data; possibly some larger datasets

Out-of-distribution generalizations

Description: Read some recent publications on out-of-distribution generalization and write a summary of their differences, advantages and drawbacks.
Methods: Linear models; tree ensembles; structural causal models
Knowledge: Regression and Classification; Causality
Data: Some small simulation studies; if of interest, also larger ICU patient datasets

Quantile Treatment Effects

Description: Read up on quantile treatment effects, which characterize the possibly heterogeneous causal effect, and write a summary of current approaches.
Methods: Linear models; tree ensembles; structural causal models; instrumental variables
Knowledge: Regression and Classification; Causality
Data: Can be theoretical; can also use some large-scale climate data
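
Illustration: in the simplest setting, a randomized treatment, the quantile treatment effect at level p can be estimated as the difference between the p-quantiles of the treated and the control outcomes. This is a minimal sketch under that randomization assumption. The simulated outcome distributions and the helper names are hypothetical, and approaches for observational data (e.g. with instrumental variables) are not covered.

```python
import math
import random

def quantile(sample, p):
    """Empirical p-quantile: smallest value with at least a fraction p of the sample at or below it."""
    ordered = sorted(sample)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]

def quantile_treatment_effect(treated, control, p):
    """Difference of the marginal p-quantiles of treated and control outcomes."""
    return quantile(treated, p) - quantile(control, p)

rng = random.Random(2)
# Assumed outcomes: treatment shifts the mean by 1 and doubles the spread.
control = [rng.gauss(0.0, 1.0) for _ in range(5000)]
treated = [rng.gauss(1.0, 2.0) for _ in range(5000)]

qte = {p: quantile_treatment_effect(treated, control, p) for p in (0.1, 0.5, 0.9)}
for p, effect in qte.items():
    print(f"QTE at p = {p}: {effect:.2f}")
```

Here the average effect alone would hide that the treatment lowers the lower quantiles and raises the upper ones. Note that this compares quantiles of the two outcome distributions, which is not the same as the quantiles of individual-level effects.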

Christoph Schultheiss (with Peter Bühlmann)

Contact: E-mail

Goodness-of-fit test for detecting local causal structures

Description: The idea is to develop a goodness-of-fit method that aims to find out whether fitted regression models might be causal. We came up with a method for linear models, which can be shown to do the right thing asymptotically. Details can be found here. We would like to have a similar method for a broader class of regression models. In the "population case", where one knows the exact data distribution, this is rather straightforward. How best to implement this in practice with finite data, where regression functions must be estimated and statistical tests are needed afterwards, is an open question. We have some ideas that could be tried in simulations, but new ideas are welcome as well. The project work would be mainly statistical methodology and simulations.
Methods: TBD
Data: Mostly simulated
Knowledge: Statistical methods, programming in R or Python.


Alexander Henzi (with Peter Bühlmann)

Contact: E-mail

Smooth isotonic distributional regression

Description: Isotonic distributional regression (IDR; https://doi.org/10.1111/rssb.12450, doi.org/10.1214/19-EJS1659) is a method for estimating the conditional distribution of an outcome given covariates under monotonicity constraints. The estimator produces discrete distributions, but often one would like to have an estimate of the conditional density. The goal of this project is to investigate methods for smoothing the IDR output distributions, based on a kernel density estimation approach.
Methods: kernel density estimation, shape restricted regression
Knowledge: basic knowledge of kernel density estimation, nonparametric statistics, R (or Python) programming
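
Illustration: the basic smoothing step can be sketched as a weighted kernel density estimate, which turns a discrete distribution with atoms and probability weights into a density. This is a minimal sketch, not the estimator to be developed in the project: IDR itself is not implemented, the atoms and weights are a hypothetical stand-in for one IDR predictive distribution, and the Gaussian kernel and the bandwidth are arbitrary choices.

```python
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def smoothed_density(atoms, weights, bandwidth):
    """Weighted kernel density estimate: f(y) = sum_i w_i * K((y - a_i) / h) / h."""
    def density(y):
        total = sum(w * gaussian_kernel((y - a) / bandwidth) for a, w in zip(atoms, weights))
        return total / bandwidth
    return density

# Hypothetical discrete predictive distribution: support points and their probabilities.
atoms = [0.0, 1.0, 1.5, 3.0]
weights = [0.1, 0.4, 0.3, 0.2]

density = smoothed_density(atoms, weights, bandwidth=0.4)

# Riemann sum on a grid from -3 to 6 to check that the density integrates to about 1.
step = 0.01
grid = [-3.0 + step * i for i in range(901)]
mass = sum(density(y) for y in grid) * step
print(f"approximate total mass: {mass:.4f}")
```

How the bandwidth should be chosen, and whether smoothing preserves the monotonicity constraints across covariate values, are the kind of questions left open by this sketch.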