CEREMADE New Statisticians Seminar

Wednesday 23 January 2013, room AR52-53 (wing A, 5th floor, in the middle).
Organized by Vincent Rivoirard and Robin Ryder.

Schedule

Titles and abstracts

13:30, Jean-Bernard Salomond: Consistency and concentration rate of the posterior under monotonicity constraints
We consider the well-known problem of estimating a density function under qualitative assumptions. More precisely, we estimate monotone nonincreasing densities in a Bayesian setting and derive convergence rates for the posterior distribution under a Dirichlet process prior and a finite mixture prior. We prove that the posterior distribution based on either prior concentrates at the rate $(n/\log(n))^{-1/3}$, which is the minimax rate of estimation up to a logarithmic factor. We also study the behaviour of the posterior for the pointwise loss at any fixed point of the support of the density and for the sup norm. We prove that the posterior is consistent and achieves the optimal concentration rate for both losses.
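
To make the logarithmic factor explicit, the stated rate can be factored as follows. This is a plain rewriting of the expression in the abstract, taking $n^{-1/3}$ as the reference rate for monotone densities:

$$
\left(\frac{n}{\log n}\right)^{-1/3} = n^{-1/3}\,(\log n)^{1/3},
$$

so the posterior concentration rate differs from $n^{-1/3}$ by the factor $(\log n)^{1/3}$.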

13:55, Sofia Tsepletidou: Computational Bayesian Tools for Modeling the Aging Process
Whereas the aging process is obvious in macroscopic organisms, it is not in single-celled ones. However, when monitoring the growth of rod-shaped bacterial colonies, for instance using the model organism E. coli, it becomes possible to recognize an aging mechanism. This is due to the division process, which splits the cell transversally, producing a new end per progeny cell; this new end is called the new pole, whereas the other, pre-existing end is called the old pole. The replicative age is thus defined as the number of generations elapsed since the old pole arose. The older this pole, the slower its growth; more damage is therefore expected to have accumulated, which corresponds to an increased physiological age. However, the replicative age accounts for a significant, yet limited, fraction of the variability observed in the physiological characteristics. Understanding the impact of the replicative age on the physiological measurements, as well as the mechanism by which the cells are rejuvenated, symmetrical or not, is possible by reconstructing a hidden quantity that would govern the physiology of the cell while fulfilling basic conservation laws. Estimation takes the form of an exploration of the approximate posterior distribution of the parameters of the constructed mathematical model. Approximate Bayesian Computation methods (the ABC rejection sampler and the ABC MCMC sampler) are considered in order to avoid the combinatorial cost, as well as the difficulty of computing the distribution of the statistics on which this study relies. Results show that the method recognizes the presence and the absence of asymmetry well, but not when the asymmetry is weak.
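
For readers unfamiliar with the ABC rejection sampler mentioned in this abstract, here is a minimal sketch on a toy problem. It is not the speaker's model: the uniform prior, the Bernoulli data, the success-proportion summary, the tolerance and the function names (`abc_rejection`, `prior_sample`, `simulate_stat`) are all assumptions made for illustration.

```python
import random
import statistics


def abc_rejection(observed_stat, prior_sample, simulate_stat, tolerance, n_draws, rng):
    """Keep the prior draws whose simulated summary is within `tolerance` of the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)          # candidate parameter drawn from the prior
        stat = simulate_stat(theta, rng)   # summary of data simulated under theta
        if abs(stat - observed_stat) <= tolerance:
            accepted.append(theta)
    return accepted


# Toy setting (hypothetical): theta is a success probability with a uniform prior,
# the data are 50 Bernoulli(theta) trials, and the summary is the success proportion.
def prior_sample(rng):
    return rng.random()


def simulate_stat(theta, rng, n_trials=50):
    successes = sum(1 for _ in range(n_trials) if rng.random() < theta)
    return successes / n_trials


rng = random.Random(0)
accepted = abc_rejection(0.3, prior_sample, simulate_stat, 0.05, 5000, rng)
posterior_mean = statistics.mean(accepted)
print(len(accepted), round(posterior_mean, 3))
```

The accepted draws form a sample from an approximation of the posterior. The likelihood is never evaluated, only simulated from, which is the point of the method when the distribution of the summary statistics is hard to compute.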

14:20, Marc Hoffmann: Statistical inference for a structured population via a transport-fragmentation mechanism
We investigate inference in simple models that describe the evolution (in size or age) of a population of bacteria across scales. The size of the system evolves according to a fragmentation-transport equation: each individual grows with a given transport rate and splits into two offspring according to a binary fragmentation process with an unknown division rate that depends on its size. Macroscopically, the system is well approximated by a PDE, and statistical inference translates into a nonlinear inverse problem. Microscopically, a more accurate description is given by a stochastic piecewise deterministic Markov process, which allows for other methods of inference but introduces stochastic dependence. We will discuss and present some new results on the inference of the division rate. Real data analysis is conducted on E. coli experiments. This is joint work with M. Doumic (INRIA and Paris 6), N. Krell (Rennes 1) and L. Robert (INRA).
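
The microscopic description can be illustrated by simulating a single lineage of such a process. The sketch below is only a crude time-discretized approximation under assumptions that are not in the abstract: exponential growth at rate `growth_rate`, division into two equal halves, only one daughter followed, and a hypothetical division rate B(x) = x^2.

```python
import random


def simulate_lineage(division_rate, growth_rate, x0, horizon, dt, rng):
    """Euler-type simulation of one cell lineage; returns (times, sizes, division_times)."""
    t, x = 0.0, x0
    times, sizes, division_times = [t], [x], []
    while t < horizon:
        # A division occurs in [t, t + dt) with probability close to B(x) * dt.
        if rng.random() < min(1.0, division_rate(x) * dt):
            x = x / 2.0                      # assumed equal splitting, one daughter followed
            division_times.append(t)
        else:
            x = x + growth_rate * x * dt     # deterministic growth between divisions
        t += dt
        times.append(t)
        sizes.append(x)
    return times, sizes, division_times


rng = random.Random(1)
times, sizes, division_times = simulate_lineage(
    division_rate=lambda x: x ** 2,  # hypothetical size-dependent division rate B(x)
    growth_rate=1.0,
    x0=1.0,
    horizon=20.0,
    dt=0.01,
    rng=rng,
)
print(len(division_times), "divisions observed")
```

In the inference problem described in the abstract, the function passed as `division_rate` is the unknown quantity, to be recovered from observed sizes.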

15:30, Bartek Knapik: Recovery of linear functionals in the infinite-dimensional normal mean model
We consider a generalization of the infinite-dimensional normal mean model in which the mean is related to the parameter of interest through a known injective linear transformation. Our interest lies in the recovery of linear functionals of the parameter. We adopt the Bayesian paradigm by putting a Gaussian prior on the parameter of interest and show that the corresponding posterior distribution contracts around the true value of the linear functional at a rate that depends on the smoothness of the truth, the smoothness and scale of the prior, the spectral properties of the transformation of the mean, and the smoothness of the linear functional. We consider not only continuous linear functionals, but also a certain class of discontinuous functionals (containing, for instance, point evaluation when the parameter of interest is a function). We also study the frequentist coverage of Bayesian credible intervals, including a semiparametric version of the Bernstein-von Mises theorem. This talk is based on joint work with Aad van der Vaart (Leiden University) and Harry van Zanten (University of Amsterdam).
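
The conjugate structure behind such results can be sketched in a simplified setting. Assume (this is not stated in the abstract) that the transformation is diagonal in a suitable basis, so that one observes $Y_i = \kappa_i \mu_i + n^{-1/2} Z_i$ with independent standard normal $Z_i$ and known $\kappa_i \neq 0$, and that the prior makes the coordinates independent with $\mu_i \sim N(0, \lambda_i)$. The symbols $\kappa_i$, $\lambda_i$ and $l_i$ are notation introduced for this example only. Combining the prior precision $1/\lambda_i$ with the likelihood precision $n\kappa_i^2$ gives, coordinate by coordinate,

$$
\mu_i \mid Y \sim N\!\left( \frac{n \kappa_i \lambda_i Y_i}{1 + n \kappa_i^2 \lambda_i},\; \frac{\lambda_i}{1 + n \kappa_i^2 \lambda_i} \right).
$$

For a linear functional $L\mu = \sum_i l_i \mu_i$ whose coefficients make the sums converge, the posterior of $L\mu$ is then Gaussian as well:

$$
L\mu \mid Y \sim N\!\left( \sum_i \frac{n l_i \kappa_i \lambda_i Y_i}{1 + n \kappa_i^2 \lambda_i},\; \sum_i \frac{l_i^2 \lambda_i}{1 + n \kappa_i^2 \lambda_i} \right).
$$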

15:55, Laure Sansonnet: Nonparametric estimation in a Poissonian interaction model and application to genomic data
The aim of this talk is to present a statistical approach for studying the dependence between two events modeled by point processes. We focus in particular on genomics, with the goal of detecting favored or avoided distances between two motifs along a genome, which suggest possible interactions at the molecular level. To this end, we introduce a so-called reproduction function, which quantifies the preferred positions of the motifs and is modeled by the intensity of a Poisson process. We first consider the estimation of this function, which is assumed to be highly localized. Using wavelet bases (in practice, the Haar basis) and thresholding techniques, we construct an adaptive estimator that satisfies an oracle-type inequality. We then briefly present the practical procedure that was implemented and, finally, apply the method to the analysis of the dependence between promoter sites and genes in the bacterium E. coli, using a real data set.
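
As a generic illustration of Haar wavelet thresholding, here is a minimal sketch on a short vector. It is not the speaker's estimator: the input vector, the fixed threshold and the function names are assumptions for this example, whereas the procedure in the talk works with Poisson process data and its own threshold choice.

```python
import math


def haar_forward(signal):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    approx = list(signal)
    details = []  # detail coefficients, finest scale first
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx[0], details


def haar_inverse(coarse, details):
    """Invert haar_forward, rebuilding from the coarsest scale to the finest."""
    approx = [coarse]
    for level in reversed(details):
        rebuilt = []
        for s, d in zip(approx, level):
            rebuilt.append((s + d) / math.sqrt(2))
            rebuilt.append((s - d) / math.sqrt(2))
        approx = rebuilt
    return approx


def hard_threshold(details, threshold):
    """Set to zero every detail coefficient whose absolute value is at most `threshold`."""
    return [[d if abs(d) > threshold else 0.0 for d in level] for level in details]


# Hypothetical input: a localized bump with small fluctuations elsewhere.
signal = [0.1, -0.1, 0.05, 0.0, 4.0, 4.1, 0.0, -0.05]
coarse, details = haar_forward(signal)
kept = hard_threshold(details, 0.5)
denoised = haar_inverse(coarse, kept)
n_kept = sum(1 for level in kept for d in level if d != 0.0)
print(n_kept, [round(v, 3) for v in denoised])
```

Because a localized function has few large Haar coefficients, thresholding keeps a sparse representation while discarding small fluctuations.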

16:20, Samuel Vaiter: The Degrees of Freedom of the Group Lasso for a General Design
In this paper, we are concerned with regression problems where covariates can be grouped in nonoverlapping blocks, and where only a few of them are assumed to be active. In such a situation, the group Lasso is an attractive method for variable selection since it promotes sparsity of the groups. We study the sensitivity of any group Lasso solution to the observations and provide its precise local parameterization. When the noise is Gaussian, this allows us to derive an unbiased estimator of the degrees of freedom of the group Lasso. This result holds true for any fixed design, no matter whether it is under- or overdetermined. With these results at hand, various model selection criteria, such as the Stein Unbiased Risk Estimator (SURE), are readily available which can provide an objectively guided choice of the optimal group Lasso fit.
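
To give a concrete feel for a degrees-of-freedom estimate, here is a sketch in the much simpler special case of an orthonormal design, which is an assumption of this example and not the general design treated in the talk. In that case the group Lasso solution for the objective (1/2)||y - X b||^2 + lam * sum_g ||b_g|| is block soft-thresholding of z = X^T y, and the divergence of this map gives the estimate below. The function names and the numbers are hypothetical.

```python
import math


def block_soft_threshold(z_groups, lam):
    """Group Lasso solution under an orthonormal design: shrink each block of z = X^T y."""
    estimates = []
    for z in z_groups:
        norm = math.sqrt(sum(v * v for v in z))
        shrink = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        estimates.append([shrink * v for v in z])
    return estimates


def degrees_of_freedom(z_groups, lam):
    """Divergence of block soft-thresholding, summed over the active groups."""
    df = 0.0
    for z in z_groups:
        norm = math.sqrt(sum(v * v for v in z))
        if norm > lam:
            # An active block of size p contributes 1 + (p - 1) * (1 - lam / ||z||).
            df += 1.0 + (len(z) - 1) * (1.0 - lam / norm)
    return df


z_groups = [[3.0, 4.0], [0.1, 0.2]]  # hypothetical blocks of X^T y
lam = 1.0
beta = block_soft_threshold(z_groups, lam)
df = degrees_of_freedom(z_groups, lam)
print(beta, df)
```

Here the first block has norm 5 and is shrunk by the factor 0.8, while the second block is set to zero. Only the first block contributes to the estimate, which is smaller than its size 2 because of the shrinkage.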