[Statlist] Next talk of Foundations of Data Science Seminar with Manfred K. Warmuth, UC Santa Cruz and Google Zurich, Wednesday, March 27, 2019

Maurer Letizia letizia.maurer at ethz.ch
Fri Mar 15 17:38:14 CET 2019


ETH Foundations of Data Science



We are pleased to announce the following talk:


Organisers:

Profs. Helmut Bölcskei, Peter Bühlmann, Joachim M. Buhmann, Thomas Hofmann, Andreas Krause, Amos Lapidoth, Hans-Andrea Loeliger, Marloes H. Maathuis, Nicolai Meinshausen, Gunnar Rätsch, Sara van de Geer
_______________________________________________________________________________________________________________________________________________________________________________________

Talk by Manfred K. Warmuth, UC Santa Cruz and Google Zurich

Wednesday, March 27, 2019, at 12:30
ETH Zurich, HG D 5.2
****************************************************************************


Title:

Reverse Iterative Volume Sampling for Linear Regression

Joint work with Michal Derezinski


Abstract:

Consider the following basic one-dimensional linear regression problem. You are given a set of n points and are to predict the hidden response value for each point. The goal is to achieve total square loss close to that of the optimal linear predictor while requesting only a small number of the hidden responses. We show that if you sample JUST ONE point from the set and request its response value, then the linear predictor fitting that point-response pair has, in expectation, exactly twice the optimal total square loss. In d dimensions, sampling a subset of just d points and fitting their responses achieves expected total loss (d+1) times the optimum. The key trick is to use a joint sampling technique called volume sampling, which selects a diverse subset of points.

We show that the least squares solution obtained for the volume-sampled subproblem is an unbiased estimator of the optimal solution based on all n responses. This unbiasedness is a desirable property that is not shared by other common subset selection techniques.

Motivated by these basic properties, we develop a theoretical framework for studying volume sampling, resulting in a number of new matrix expectation equalities and statistical guarantees that are of importance not only to least squares regression but also to numerical linear algebra in general. Our methods also lead to a regularized variant of volume sampling, and we propose the first efficient algorithm for volume sampling, which makes this technique a practical tool in the machine learning toolbox. Finally, we provide experimental evidence confirming our theoretical findings.



