Student Seminar in Statistics:
Inference in Non-Classical Regression Models

Spring semester 2020

General information

Lecturer Fadoua Balabdaoui
Assistants Leonard Henckel, Yulia Kulagina
Lectures Mon 15.00-17.00 HG E 33.1
Course catalogue data VVZ

Course content

Abstract

The course encompasses a review of some non-standard regression models and the statistical properties of estimation methods in such models.

Objective

The students get to discover some lesser-known regression models that either generalize well-known linear models (for example, monotone regression) or violate some of the most fundamental assumptions (as in shuffled or unlinked regression models).

Content

Linear regression is one of the most widely used models for prediction and hence one of the best understood in the statistical literature. However, linearity might be too simplistic to capture the actual relationship between a response and the given covariates. Also, there are many real data problems where linearity is plausible but the actual pairing between the observed covariates and the responses is completely or partially lost. In this seminar, we review some of the non-classical regression models and the statistical properties of the estimation methods considered by well-known statisticians and machine learners. This will encompass:

  1. Monotone regression
  2. Single index model
  3. Unlinked regression
  4. Partially unlinked regression
  5. High-dimensional regression and sparsity
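
As a small illustration of the first topic, monotone regression fits a nondecreasing sequence to the responses by least squares, and this fit can be computed with the pool-adjacent-violators algorithm. The following is only a minimal sketch, assuming equal weights and responses already ordered by their covariate; the function name `pava` is our own choice and is not taken from the course material.

```python
def pava(y):
    """Least-squares nondecreasing fit to y (pool-adjacent-violators).

    Assumes y is ordered by the covariate and all weights are equal.
    """
    # Each block stores [sum of responses, number of points in the block].
    blocks = []
    for value in y:
        blocks.append([float(value), 1])
        # Pool the last two blocks while their means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand block means back to one fitted value per observation.
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit


if __name__ == "__main__":
    print(pava([1.0, 3.0, 2.0, 4.0, 3.5, 5.0]))
```

Adjacent responses that violate the ordering are replaced by their common average, so the fitted values are piecewise constant and nondecreasing.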

Literature

The following material will be read and studied by each pair of students (all the items listed below are available through the ETH electronic library or arXiv):

  1. Chapter 2 from the book "Nonparametric estimation under shape constraints" by P. Groeneboom and G. Jongbloed, 2014, Cambridge University Press
  2. "Nonparametric shape-restricted regression" by A. Guntuboyina and B. Sen, 2018, Statistical Science, Volume 33, 568-594
  3. "Asymptotic distributions for two estimators of the single index model" by Y. Xia, 2006, Econometric Theory, Volume 22, 1112-1137
  4. "Least squares estimation in the monotone single index model" by F. Balabdaoui, C. Durot and H. K. Jankowski, 2019, Bernoulli, Volume 25(4B), 3276-3310
  5. "Least angle regression" by B. Efron, T. Hastie, I. Johnstone and R. Tibshirani, 2004, Annals of Statistics, Volume 32, 407-499
  6. "Sharp thresholds for high dimensional and noisy sparsity recovery using l1-constrained quadratic programming (Lasso)" by M. Wainwright, 2009, IEEE Transactions on Information Theory, Volume 55, 1-19
  7. "Denoising linear models with permuted data" by A. Pananjady, M. Wainwright and T. A. Courtade, 2017, IEEE International Symposium on Information Theory, 446-450
  8. "Linear regression with shuffled data: statistical and computational limits of permutation recovery" by A. Pananjady, M. Wainwright and T. A. Courtade, 2018, IEEE Transactions on Information Theory, Volume 64, 3286-3300
  9. "Linear regression without correspondence" by D. Hsu, K. Shi and X. Sun, 2017, NIPS
  10. "A pseudo-likelihood approach to linear regression with partially shuffled data" by M. Slawski, G. Diao and E. Ben-David, 2019, arXiv
  11. "Uncoupled isotonic regression via minimum Wasserstein deconvolution" by P. Rigollet and J. Weed, 2019, Information and Inference, Volume 00, 1-27

Primary target group and Prerequisites

Mainly for students from the Mathematics Bachelor and Master Programmes who, in addition to the introductory course unit 401-2604-00L Probability and Statistics, have taken at least one core or elective course in statistics. Also offered in the Master Programmes Statistics and Data Science.

Announcements



    14.02.2020
    Welcome to the website of the course Student Seminar in Statistics: Inference in Non-Classical Regression Models!
    The first class will be an introductory lecture and will take place on Monday, 17.02.2020.
    We are looking forward to seeing you!

    Assignment of topics:
    The topics will be assigned during the first class, on Monday 17.02.2020. We will send out a Doodle poll beforehand, so that you can indicate your preferences. The first student presentation will take place on 24.02.2020.

    Please let us know ASAP in case you decide not to take part in the seminar.



    30.03.2020
    Dear students,
    there is no talk today.
    We are looking forward to seeing you next week!

    The Zoom meeting ID will be e-mailed to you.


Course material and schedule

We will study the materials listed in the section Literature.

The registered students will be divided into pairs to work on the papers. Everyone is expected to participate actively during all lectures. Questions and discussions are strongly encouraged in this class!

The presentations should last roughly 2 x 25 minutes, with a 5-10 minute break in between. One of the assistants will meet with you twice before your presentation, to answer questions about the material and to give feedback on your planned presentation. More detailed guidelines for the presentations will be given during the first class. Please also see the FAQ for further details.


Week Topic Slides
Week 1 (17.02.2020) Introductory Lecture by Dr. Fadoua Balabdaoui
Week 2 (24.02.2020) Group 1: Basic Estimation Problems with Monotonicity Constraints

  • Speakers: Christian Holberg, Samuel Pullely
  • Assistants: Leo, Yulia
Week 3 (02.03.2020) Group 2: Nonparametric Shape-Restricted Regression

  • Speakers: Daria Izzo, Anna Maddux
  • Assistants: Leo, Yulia
Week 4 (09.03.2020) Group 3: Asymptotic Distributions for Two Estimators of the Single Index Model

  • Speakers: Christos Papadakis, Stefania Vasilaki
  • Assistants: Leo, Yulia
Week 5 (16.03.2020)
Week 6 (23.03.2020) Group 4: Least Squares Estimation in the Monotone Single Index Model

  • Speakers: Clemens Blab, Cyrill Scheidegger
  • Assistants: Leo, Yulia
Week 7 (30.03.2020) No class
Week 8 (06.04.2020) Group 5: Least Angle Regression

  • Speakers: Janosch Ott, Michael Zellinger
  • Assistants: Leo, Yulia
Week 9 (27.04.2020) Group 6: Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using ℓ1-Constrained Quadratic Programming (Lasso)

  • Speakers: Daisuke Aldo Frei, Kaye Lisa Iseli
  • Assistants: Leo, Yulia
Week 10 (04.05.2020) Group 7: Denoising Linear Models with Permuted Data

  • Speakers: Felix Hafenmair, Judith Verena Hiesmayr
  • Assistants: Leo, Yulia
Week 11 (11.05.2020) Group 8: Linear Regression with Shuffled Data: Statistical and Computational Limits of Permutation Recovery

  • Speakers: Gian Luca Cola, Arianna Francesca Piana
  • Assistants: Leo, Yulia
Week 12 (18.05.2020) Group 9: Linear Regression Without Correspondence

  • Speakers: Xinnuo Lin, Alexander Roebben
  • Assistants: Leo, Yulia
Week 13 (25.05.2020) Group 10: A Pseudo-Likelihood Approach to Linear Regression with Partially Shuffled Data

  • Speakers: Nikolaus Doppelbauer, Marc Nübel
  • Assistants: Leo, Yulia

FAQ

  1. How long should the presentation be?

    The total presentation time is 50 minutes. Each student should present for roughly half of the time. We advise you to split the presentation into two parts of about 25 minutes each, with a 5-10 minute break in between. Please make sure to practice so that you don't go over your time! We highly encourage interaction and discussion with the audience, both during and after your talk. Discussion that takes place during your talk will not be counted as presentation time.

  2. Should I use a certain template for my slides?

    You can use any template you like. We recommend using one of the ETH presentation templates.

  3. How should the presentation be structured?

    The main purpose of the presentation is to transmit knowledge to the audience. So, after reading the material, please take a step back and try to put yourself in the shoes of the audience: What do they already know? What would they find most interesting? What would be helpful examples? We will also provide further guidelines for the presentations during the first lecture.

  4. Do I need to bring my own laptop to present my slides?

    Ideally, yes. If you do not have a laptop, or you do not have a way of connecting to the projector, please let the assistants know in advance.

  5. Will my slides be published somewhere?

    Yes, all slides will be published on the course website after the presentation. Please make sure to respect copyright. In particular, if you include any images or tables not created by yourself in the presentation, make sure to include the source of the image/table as well.

  6. What is the role of the assistants?

    The assistant in charge of your group gives you guidance and feedback prior to your presentation. You will have a chance to meet with the assistant twice before your presentation. The first meeting will be on the Thursday 1.5 weeks before your presentation (Thursday is the default, but the meeting can be rescheduled by mutual agreement). The second meeting will typically take place on the Thursday half a week before your presentation (the same rescheduling rule applies).

  7. How should I prepare for the meetings with the assistants?

    The first meeting: you should read all material in advance, make a list of questions you have, and make a rough plan of what you would like to present (main concepts, main examples, questions you could pose to the audience to create some interaction, an R example that you could integrate, etc.). The second meeting: your presentation should be fully prepared and should be sent to the assistants the day before. During the meeting, you will get feedback on your presentation, and you can clarify any remaining issues.

  8. Do I have to attend all lectures?

    Yes, attendance at all lectures is compulsory. If you have to miss a class (due to illness or force majeure), please contact Dr. Fadoua Balabdaoui directly.