The topics covered include
Friday 29 May, joint session with the GdR IASIS, at the IHP in amphi Hermite.
Abstract: In this mini-tutorial, I will review some recent results on the analysis of non-linear models motivated by machine learning, using tools from random matrix theory. Starting from the analysis of two-layer neural networks at initialisation (a.k.a. the random features model), we will discuss the notion of Gaussian universality, which makes it possible to treat non-linear functions of random matrices effectively with standard tools. We will then discuss the problem of feature learning, when the network weights are trained and develop correlations with the data, and how it can be treated with ideas that generalise universality. This will allow us to show the advantage of feature learning over kernel methods. Finally, I will discuss some of the more recent progress concerning the analysis of the spectrum of trained two-layer neural networks.
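To make the Gaussian-universality statement concrete, here is a minimal numpy sketch comparing the spectrum of the conjugate kernel of a random features map with that of its Gaussian-equivalent model; the dimensions and the tanh activation are illustrative assumptions, not taken from the talk.

```python
import numpy as np

# Minimal sketch of Gaussian equivalence for the random features model.
# Dimensions and the tanh activation are illustrative assumptions.
rng = np.random.default_rng(0)
n, d, p = 2000, 1500, 1000            # samples, input dimension, features
f = np.tanh                            # odd activation, so mu_0 = E[f(g)] = 0

W = rng.normal(0.0, 1.0 / np.sqrt(d), size=(p, d))   # weights at initialisation
X = rng.normal(0.0, 1.0, size=(d, n))                # data
Y = f(W @ X)                           # random features; entries of WX ~ N(0,1)

# Gaussian moments of f at g ~ N(0,1), estimated by Monte Carlo
g = rng.normal(size=10**6)
mu1 = np.mean(g * f(g))                              # linear component of f
mu_star = np.sqrt(np.mean(f(g) ** 2) - mu1**2)       # residual "noise" level

# Gaussian-equivalent model: same linear part, independent Gaussian residual
Z = rng.normal(size=(p, n))
Y_eq = mu1 * (W @ X) + mu_star * Z

eig = np.linalg.eigvalsh(Y @ Y.T / n)
eig_eq = np.linalg.eigvalsh(Y_eq @ Y_eq.T / n)

# In the large-dimensional limit the two empirical spectra coincide
print("spectral edges, nonlinear model :", eig[0], eig[-1])
print("spectral edges, Gaussian model  :", eig_eq[0], eig_eq[-1])
```

The two printed spectral summaries should nearly coincide, which is the point of Gaussian equivalence: the non-linearity only enters through the moments mu1 and mu_star.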
Abstract: In recent years, models from machine learning have motivated the study of nonlinear random matrices, that is, random matrices involving the entrywise application of a deterministic nonlinear function. In this talk, we will focus on matrices of the form YY* with Y = f(WX). Here, W and X are random rectangular matrices with i.i.d. centered entries, representing the weights and data in a two-layer feed-forward neural network, and f is a nonlinear activation function. This setting is commonly known as the random features model. When the entries of both the weights and the inputs are light-tailed, the asymptotic behavior of the eigenvalues is by now well understood and coincides with that of a simple Gaussian-equivalent model. In this talk, I will instead focus on the regime where the weights are heavy-tailed, based on recent joint work with Alice Guionnet. This regime is motivated by empirical observations in trained neural networks, where learned weights often exhibit strong correlations and heavy-tailed distributions. We will show that, in this context, the spectral behavior departs significantly from the light-tailed regime, leading to new spectral phenomena, with a richer combinatorial structure in the moment expansion.
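As a rough numerical counterpart to the heavy-tailed regime, one can replace the Gaussian weights with heavy-tailed ones and inspect the top of the spectrum of YY*/n; the Student-t weights below are a stand-in assumption, not necessarily the model of the joint work with Alice Guionnet.

```python
import numpy as np

# Rough illustration of light- vs heavy-tailed weights in Y = f(WX).
# The Student-t weights are a stand-in assumption for the heavy-tailed regime.
rng = np.random.default_rng(1)
n, d, p = 1500, 1200, 1000
f = np.tanh
X = rng.normal(size=(d, n))

def top_eigs(W, k=5):
    """Top-k eigenvalues of the conjugate kernel YY^T/n with Y = f(WX/sqrt(d))."""
    Y = f((W @ X) / np.sqrt(d))
    return np.linalg.eigvalsh(Y @ Y.T / n)[-k:]

W_light = rng.normal(size=(p, d))                      # light-tailed weights
# df = 2.5: finite variance (rescaled to 1) but infinite fourth moment
W_heavy = rng.standard_t(df=2.5, size=(p, d)) / np.sqrt(2.5 / 0.5)

print("light-tailed, top eigenvalues:", top_eigs(W_light))
print("heavy-tailed, top eigenvalues:", top_eigs(W_heavy))
# With heavy tails one expects departures from the Gaussian-equivalent
# prediction at the top of the spectrum (e.g. large outlier eigenvalues).
```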
Abstract: This work provides a comprehensive understanding of the estimation of a planted low-rank signal in a general spiked tensor model near the computational threshold. Relying on standard tools from the theory of large random matrices, we characterize the large-dimensional spectral behavior of the unfoldings of the data tensor and exhibit the relevant signal-to-noise ratios governing the detectability of the principal directions of the signal. These results make it possible to accurately predict the reconstruction performance of the truncated multilinear SVD (MLSVD) in the non-trivial regime. This is particularly important since it serves as an initialization for the higher-order orthogonal iteration (HOOI) scheme, whose convergence to the best low-multilinear-rank approximation depends entirely on its initialization. We give a sufficient condition for the convergence of HOOI and show that the number of iterations before convergence tends to 1 in the large-dimensional limit.
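To fix ideas, here is a minimal numpy sketch of HOOI initialised with the truncated MLSVD on a rank-one spiked tensor; the dimensions, the signal-to-noise ratio beta and the Gaussian noise scaling are illustrative assumptions, not the exact model of the talk.

```python
import numpy as np

# Minimal HOOI sketch on a rank-one spiked tensor, initialised with the
# truncated MLSVD. Dimensions, beta and the noise scaling are illustrative.
rng = np.random.default_rng(2)
n, r, beta = 80, 1, 10.0               # side length, multilinear rank, SNR

def unfold(T, mode):
    """Mode-k unfolding: move axis `mode` first, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def top_left_sv(M, r):
    """Top-r left singular vectors of M."""
    return np.linalg.svd(M, full_matrices=False)[0][:, :r]

# Planted signal: beta * x ⊗ y ⊗ z plus Gaussian noise of size 1/sqrt(n)
x, y, z = (v / np.linalg.norm(v) for v in rng.normal(size=(3, n)))
T = beta * np.einsum('i,j,k->ijk', x, y, z) \
    + rng.normal(size=(n, n, n)) / np.sqrt(n)

# Truncated MLSVD initialisation: SVD of each unfolding of T
U = [top_left_sv(unfold(T, k), r) for k in range(3)]

# HOOI sweeps: project T on the other two factors, then re-estimate U[k]
for _ in range(10):
    for k in range(3):
        G = T
        for m in range(3):
            if m != k:
                G = np.moveaxis(np.tensordot(G, U[m], axes=([m], [0])), -1, m)
        U[k] = top_left_sv(unfold(G, k), r)

print("alignments with the planted directions:",
      [abs(float(u[:, 0] @ v)) for u, v in zip(U, (x, y, z))])
```

In line with the abstract's claim that the number of iterations before convergence tends to 1, one can check numerically that the alignments barely change after the first HOOI sweep in large dimensions.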
* Organisers 2024-2025.
* Friday 10 October, at the IHP
* Friday 21 November, at the IHP
* Friday 5 December, at the CMLS, École Polytechnique
* Friday 16 January, at the IHP
* Friday 20 February, in Toulouse
* Friday 13 March, at the IHP
* Friday 10 April, at the IHP
* Friday 29 May, joint MEGA & IASIS day at the IHP
* Friday 26 June, at the École Polytechnique
The MEGA seminar was created in 2014 by Djalil Chafaï and Camille Male, with the help of Florent Benaych-Georges.
The image is taken from https://www.mat.tuhh.de/forschung/aa/forschung.html.