Kernel-Smoothed Score in Denoising Diffusions
Abstract
Denoising diffusion models can generate realistic and complex data (images, text, molecular structures, ...). However, we still lack a clear theoretical picture of when and why they generalise rather than memorise their finite training sets. To probe this, we introduce a simplified model that replaces the empirical score with a mollified score obtained by convolution with a smoothing kernel. This modification induces a two-fold regularisation: (i) an isotropic diffusion that blurs fine, sample-specific features, and (ii) a smoothing along the data manifold that preserves the global structure of the support.
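To fix notation, here is one minimal formalisation of this construction; the specific mollifier $\rho_\varepsilon$ and the choice of convolving the score field itself are illustrative assumptions on our part, not notation taken from the work. With training samples $x_1,\dots,x_n \in \mathbb{R}^d$ and noise scale $\sigma_t$ at time $t$, the empirical noised density and score are
\[
\hat p_t(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} \mathcal{N}\!\big(x \,;\, x_i,\ \sigma_t^2 I_d\big),
\qquad
\hat s_t(x) \;=\; \nabla_x \log \hat p_t(x),
\]
and a mollified score is obtained by convolution,
\[
s_t^{\varepsilon}(x) \;=\; \big(\rho_\varepsilon * \hat s_t\big)(x)
\;=\; \int_{\mathbb{R}^d} \rho_\varepsilon(x-y)\,\hat s_t(y)\,\mathrm{d}y,
\]
with $\rho_\varepsilon$ a smooth mollifier of bandwidth $\varepsilon$. As $\varepsilon \to 0$ one recovers the empirical score, while $\varepsilon > 0$ smooths away sample-specific detail.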
Building on this idea, we propose the LED-KDE (Log-Exponential Double-Kernel Density Estimator), a density estimator that, unlike standard KDEs, reduces the leakage of mass off the data manifold. Using a bias-variance decomposition tied to the mollification, we analyse the empirical score in the small-time, large-dataset regime and derive upper bounds on the distance between the true data distribution and the distribution generated by the mollified denoising diffusion. These bounds quantify how mollification improves the generative process and clarify the mechanism by which denoising diffusion models can avoid memorisation and achieve better generalisation.
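For context, the baseline against which the leakage claim is made is the standard isotropic Gaussian KDE (a textbook definition, not the paper's notation):
\[
\hat p_h^{\mathrm{KDE}}(x) \;=\; \frac{1}{n}\sum_{i=1}^{n}\frac{1}{(2\pi h^2)^{d/2}}
\exp\!\Big(-\frac{\|x-x_i\|^2}{2h^2}\Big),
\]
which spreads mass isotropically in all $d$ ambient directions, including those normal to a low-dimensional data manifold; it is this off-manifold mass that a double-kernel construction of the above kind is designed to suppress.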
Based on joint work with Franck Gabriel, Maria Han Veiga, and Emmanuel Schertzer.