Differences

Below are the differences between two revisions of the page.

mega:seminaire [2026/05/06 12:01] Raphaël BUTEZ
mega:seminaire [2026/05/06 14:27] (current version) Raphaël BUTEZ
Line 30:
 Abstract: This work presents a comprehensive understanding of the estimation of a planted low-rank signal from a general spiked tensor model near the computational threshold. Relying on standard tools from the theory of large random matrices, we characterize the large-dimensional spectral behavior of the unfoldings of the data tensor and exhibit relevant signal-to-noise ratios governing the detectability of the principal directions of the signal. These results allow us to accurately predict the reconstruction performance of the truncated multilinear SVD (MLSVD) in the non-trivial regime. This is particularly important since it serves as an initialization of the higher-order orthogonal iteration (HOOI) scheme, whose convergence to the best low-multilinear-rank approximation depends entirely on its initialization. We give a sufficient condition for the convergence of HOOI and show that the number of iterations before convergence tends to 1 in the large-dimensional limit.
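For readers unfamiliar with these objects, the following is a minimal NumPy sketch of the pipeline the abstract refers to: a truncated MLSVD used as the initialization of HOOI, run on a rank-1 spiked 3rd-order tensor. The tensor size, signal-to-noise ratio, and all function names are illustrative assumptions, not code from the paper.

<code python>
# Illustrative sketch only: truncated MLSVD initializing HOOI on a spiked tensor.
import numpy as np

def unfold(T, mode):
    # Mode-n unfolding: bring axis `mode` to the front, flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def multiply_modes(T, U, skip=None):
    # Contract T with each factor U[m] along mode m, optionally skipping one mode.
    for m, Um in enumerate(U):
        if m != skip:
            T = np.moveaxis(np.tensordot(T, Um, axes=(m, 0)), -1, m)
    return T

def mlsvd_init(T, ranks):
    # Truncated MLSVD: top left singular vectors of each unfolding.
    return [np.linalg.svd(unfold(T, n), full_matrices=False)[0][:, :r]
            for n, r in enumerate(ranks)]

def hooi(T, ranks, n_iter=10):
    U = mlsvd_init(T, ranks)                  # MLSVD serves as the initialization
    for _ in range(n_iter):
        for n in range(T.ndim):
            G = multiply_modes(T, U, skip=n)  # project the other modes first
            U[n] = np.linalg.svd(unfold(G, n),
                                 full_matrices=False)[0][:, :ranks[n]]
    return multiply_modes(T, U), U            # core tensor and factor matrices

# Rank-1 planted signal plus Gaussian noise; beta plays the role of the SNR.
rng = np.random.default_rng(0)
n, beta = 60, 5.0
x = [rng.standard_normal(n) / np.sqrt(n) for _ in range(3)]
T = beta * np.einsum('i,j,k->ijk', *x) + rng.standard_normal((n, n, n)) / np.sqrt(n)
core, U = hooi(T, ranks=(1, 1, 1))
print([abs(U[m][:, 0] @ x[m]) / np.linalg.norm(x[m]) for m in range(3)])
</code>

The printed alignments between recovered and planted directions approach 1 as the SNR grows past the detectability threshold that the talk characterizes.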
  
-    * 16h00-16h30: Short talk, open to contributions (submission deadline 01/05/2026)
+    * 16h00-16h30: Talk by **[[https://arxiv.org/search/stat?searchtype=author&query=Morisset,+L|Lucas Morisset]]**: //Characterizing the Generalization Error of Random Feature Regression with Arbitrary Data-Augmentation.// \\ 
-    * 16h30-17h00: Short talk, open to contributions (submission deadline 01/05/2026)
+Abstract: In this work, we aim to characterize the effect that modern data augmentation schemes have on the generalization error of large deep learning models. While data augmentation is now a standard ingredient of modern machine learning pipelines, its theoretical understanding remains limited. As a tractable testbed, we study random feature regression, which has recently attracted significant interest because it captures several qualitative and quantitative properties of large neural networks. By leveraging tools from random matrix theory, we derive deterministic equivalents for the generalization error under broad augmentation schemes, including settings where the augmented samples are strongly dependent. This allows us to quantify explicitly how data augmentation reshapes the bias-variance trade-off, and to identify regimes in which it improves performance as well as regimes in which it can be detrimental. On the technical side, we develop anisotropic deterministic equivalents for key quantities of the augmented problem, including the resolvent of the sample covariance matrix of the augmented data and the trained linear readout of random feature regression. More broadly, our results provide, to the best of our knowledge, the first sharp asymptotic characterization of generalization in the presence of strong sample dependence.
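As a concrete, purely illustrative instance of this setup, the sketch below trains the ridge-regularized linear readout of a random feature model on augmented data, where each sample is repeated with additive Gaussian perturbations so that the augmented rows are strongly dependent. The dimensions, the augmentation scheme, and every parameter are assumptions for illustration, not the schemes analyzed in the talk.

<code python>
# Illustrative sketch only: random feature ridge regression on augmented data.
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 200, 50, 400                 # samples, input dimension, random features
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)

# Augmentation: k noisy copies of each sample -> strongly dependent rows.
k, sigma = 4, 0.3
X_aug = np.repeat(X, k, axis=0) + sigma * rng.standard_normal((n * k, d))
y_aug = np.repeat(y, k)

# Random feature map phi(x) = relu(W x / sqrt(d)) with fixed random weights W.
W = rng.standard_normal((p, d))
phi = lambda Z: np.maximum(Z @ W.T / np.sqrt(d), 0.0)

# Train the linear readout by ridge regression on the augmented features.
lam = 1e-3
F = phi(X_aug)
a = np.linalg.solve(F.T @ F + lam * np.eye(p), F.T @ y_aug)

# Fresh test set: the empirical test MSE estimates the generalization error.
X_te = rng.standard_normal((1000, d))
print("test MSE:", np.mean((phi(X_te) @ a - X_te @ w_star) ** 2))
</code>

Sweeping sigma and k in such a simulation is one way to observe the regimes, predicted by the deterministic equivalents, where augmentation helps or hurts.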
 +    * 16h30-17h00: Talk by **[[https://andrea-combette.com/|Andrea Combette]]**: //Initialization at Criticality to Control Property Propagation in Neural Networks.// \\ 
 +Abstract: Understanding how information propagates in very deep neural networks is essential for designing architectures that remain trainable as depth increases. In our previous work, //A New Initialisation to Control Gradients in Sinusoidal Neural Networks// (ICLR 2026), we introduced an initialisation scheme for SIREN networks that controls both the gradient variance and the Fourier spectrum of the network output. This led us to study the large-depth limit of neural networks more generally and to address the central question of this work: how do correlations, gradients, and the Neural Tangent Kernel spectrum propagate through depth? 
 +To answer this question, we develop a theoretical framework in the thermodynamic sequential limit, corresponding to the infinite-width mean-field regime. This framework provides a unified description of these propagation mechanisms by combining mean-field analysis with tools from free probability, following the approach introduced by Pennington et al. [1]. 
 +This framework allowed us to identify an initialisation strategy based on orthogonal weights that applies to a broad class of activation functions. At this initialisation, the network operates at criticality: signal propagation remains non-trivial even at large depth, the first two moments of the NTK spectrum remain stable, and gradients decay algebraically, in contrast to previous statements in the literature. These results provide a principled way to design deep networks whose signal, gradient, and kernel properties are jointly characterized by this framework. 
 + 
 +Reference: 
 +[1] J. Pennington, S. S. Schoenholz, and S. Ganguli, "Resurrecting the Sigmoid in Deep Learning through Dynamical Isometry," NeurIPS 2017. 
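To make the criticality picture concrete, here is a minimal sketch of the orthogonal-weight initialisation idea: a deep tanh network with Haar-orthogonal layers, printing how the signal norm evolves with depth. The activation, width, and depth are illustrative assumptions, not the authors' exact setting.

<code python>
# Illustrative sketch only: orthogonal initialisation in a deep tanh network.
import numpy as np

rng = np.random.default_rng(1)
width, depth = 512, 100

def haar_orthogonal(n, rng):
    # QR of a Gaussian matrix; the sign fix makes Q Haar-distributed.
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    return Q * np.sign(np.diag(R))

h = rng.standard_normal(width)
h /= np.linalg.norm(h)
for layer in range(1, depth + 1):
    h = np.tanh(haar_orthogonal(width, rng) @ h)
    if layer % 20 == 0:
        # At criticality the norm decays slowly (algebraically), not exponentially.
        print(f"layer {layer:3d}  ||h|| = {np.linalg.norm(h):.4f}")
</code>

Replacing the orthogonal matrices with i.i.d. Gaussian weights of mismatched variance makes the norm collapse or blow up exponentially, which is the off-critical behaviour the initialisation is designed to avoid.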
  
  