[Statlist] Reminder: ETH Young Data Science Researcher Seminar Zurich - hybrid seminar by Denny Wu, University of Toronto, 16 June 2022

Maurer Letizia letiziamaurer at ethz.ch
Tue Jun 14 06:46:13 CEST 2022


We are glad to announce the following talk in the ETH Young Data Science Researcher Seminar Zurich:

"High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation"
by Denny Wu, University of Toronto

Time: Thursday, 16 June 2022, 15.00 - 16.00 CEST
Place: ETH Zurich, HG G 19.1 and Zoom at https://ethz.zoom.us/j/62895316484

Abstract: We study the first gradient descent step on the first-layer weights W in a two-layer neural network, where the parameters are randomly initialized and the training objective is the empirical MSE loss. In the proportional asymptotic limit (where the training set size n, the number of input features d, and the width of the neural network N all diverge at the same rate), and under an idealized student-teacher setting, we show that the first gradient update contains a rank-1 "spike", which results in an alignment between the first-layer weights and the linear component of the teacher model f^*. To characterize the impact of this alignment, we compute the prediction risk of ridge regression on the conjugate kernel after one gradient step on W with learning rate \eta. We consider two scalings of the first-step learning rate \eta. For small \eta, we establish a Gaussian equivalence property for the trained feature map, and prove that the learned kernel improves upon the initial random feature model but cannot defeat the best linear model on the input. For sufficiently large \eta, by contrast, we prove that for certain f^*, the same ridge estimator on trained features can go beyond this "linear regime" and outperform a wide range of (fixed) kernels. Our results demonstrate that even one gradient step can lead to a considerable advantage over random features, and highlight the role of learning rate scaling in the initial phase of training.
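The procedure described in the abstract can be sketched numerically. The following is a minimal, hypothetical NumPy illustration (not code from the speaker or the paper): it takes one full-batch gradient step on the first-layer weights W under the empirical MSE loss, then fits ridge regression on the resulting trained features. All names, dimensions, and the choice of tanh activation and teacher model are illustrative assumptions.

```python
# Hypothetical sketch, in the spirit of the abstract: one gradient step on W,
# then ridge regression on the trained (conjugate-kernel) feature map.
import numpy as np

rng = np.random.default_rng(0)
n, d, N = 400, 50, 200      # samples, input dim, network width (proportional regime)
eta, lam = 1.0, 1e-2        # first-step learning rate \eta, ridge penalty

# Assumed teacher f^*: a linear component plus a small nonlinearity
beta = rng.standard_normal(d) / np.sqrt(d)
f_star = lambda X: X @ beta + 0.1 * np.tanh(X @ beta)

X = rng.standard_normal((n, d))
y = f_star(X)

# Student: f(x) = a^T sigma(W x), randomly initialized; only W is updated here
W = rng.standard_normal((N, d)) / np.sqrt(d)
a = rng.standard_normal(N) / np.sqrt(N)
sigma = np.tanh
dsigma = lambda z: 1.0 - np.tanh(z) ** 2

# One full-batch gradient step on W under the empirical MSE loss
Z = X @ W.T                                     # (n, N) pre-activations
resid = sigma(Z) @ a - y                        # (n,) residuals
grad_W = ((resid[:, None] * dsigma(Z)) * a).T @ X / n   # (N, d)
W1 = W - eta * grad_W

# Ridge regression on the trained feature map
Phi = sigma(X @ W1.T)                           # (n, N)
coef = np.linalg.solve(Phi.T @ Phi + lam * np.eye(N), Phi.T @ y)

# In-sample prediction risk of the ridge estimator on trained features
risk = np.mean((Phi @ coef - y) ** 2)
```

Varying `eta` here mirrors the two regimes in the abstract: a small first-step learning rate barely perturbs the random features, while a large one can substantially rotate W toward the teacher's linear component.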

This talk will take place in a hybrid format. You are very welcome to attend in person at ETH Zurich, HG G 19.1, or to join remotely via Zoom.


M. Azadkia, G. Chinot, J. Hörrmann, M. Löffler, A. Taeb, N. Zhivotovskiy


Seminar website: https://math.ethz.ch/sfs/news-and-events/young-data-science.html





More information about the Statlist mailing list