Cahiers du CEREMADE 

Unité Mixte de Recherche du C.N.R.S. N°7534

Abstract: The original Studentization was the
conversion of a sample mean departure into the familiar
$t$-statistic plus the derivation of the corresponding Student
distribution function; the observed value of the distribution
function is the observed $p$-value, as presented in an elemental
form. We examine this process in a broadly general context: a null
statistical model is available together with observed data; a
statistic $t(y)$ has been proposed as a plausible measure of the
location of the data relative to what is expected under the null; a
modified statistic, say $\tilde t(y)$, is developed that is
ancillary; the corresponding distribution function is determined,
exactly or approximately; and the observed value of the
distribution function is the $p$-value or percentile position of
the data with respect to the model.
Such $p$-values have had extensive coverage in the recent Bayesian
literature with many variations and some preference for two
versions labelled $p_{ppost}$ and $p_{cpred}$. The bootstrap
method also directly addresses this Studentization process.
We use recent likelihood theory that gives a factorization of a
regular statistical model into a marginal density for a full
dimensional ancillary and a conditional density for the maximum
likelihood variable. The full dimensional ancillary is shown to
lead to an explicit determination of the Studentized version
$\tilde t(y)$ together with a highly accurate approximation to its
distribution function; the observed value of the distribution
function is the $p$-value and its value as an integral is
available numerically by direct calculation or by Markov chain
Monte Carlo or other simulations.
Here, for any given initial trial or test statistic proposed as a
location indicator for a data point, we develop: an ancillary-based
$p$-value designated $p_{\rm anc}$; a special version of the
Bayesian $p_{\rm cpred}$; and a bootstrap-based $p$-value
designated $p_{\rm bs}$. We then show under moderate regularity
that these are equivalent to third order and are unique
as a determination of the statistical location of the data point,
as of course derived from the initial location measure. We also
show that these $p$-values have a uniform distribution to third
order, as based on calculations in the moderate-deviations region.
For implementation, the Bayesian and likelihood procedures would
perhaps require the same numerical computations, while the
bootstrap would require an order of magnitude more computation and
would perhaps not be accessible. Examples are given to indicate the
ease and flexibility of the approach.
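
As a minimal illustration of the original Studentization described above, and not of the third-order method developed in the paper, the following sketch computes the $t$-statistic for a sample and estimates the observed value of its distribution function by simulation. It assumes a normal null model with unknown scale, under which the $t$-statistic is ancillary, so the null distribution can be simulated from standard normal draws. The function names `t_statistic` and `simulated_p_value`, and the parameters `n_sim` and `seed`, are hypothetical and chosen only for this example.

```python
import math
import random
import statistics


def t_statistic(y, mu0):
    """Studentized departure of the sample mean from the null value mu0."""
    n = len(y)
    return math.sqrt(n) * (statistics.fmean(y) - mu0) / statistics.stdev(y)


def simulated_p_value(y, mu0, n_sim=5000, seed=0):
    """Estimate the observed value of the null distribution function of t.

    Assumption: the null model is normal with mean mu0 and unknown scale.
    The t-statistic is then free of the unknown scale, so samples drawn
    from N(0, 1) reproduce its null distribution.
    """
    rng = random.Random(seed)
    n = len(y)
    t_obs = t_statistic(y, mu0)
    below = 0
    for _ in range(n_sim):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if t_statistic(sample, 0.0) <= t_obs:
            below += 1
    # Proportion of simulated statistics at or below the observed one:
    # the percentile position of the data with respect to the null model.
    return below / n_sim
```

A call such as `simulated_p_value([1.1, 0.9, 1.3, 1.0, 1.2], mu0=0.0)` returns a value close to 1, indicating data far to the right of what the null model predicts; a sample centred on `mu0` gives a value near 0.5.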






2006-33

29-05-2006

Université de PARIS-DAUPHINE, Place du Maréchal de Lattre De Tassigny, 75775 PARIS CEDEX 16, FRANCE. Telephone: +33 (0)1 44 05 49 23, fax: +33 (0)1 44 05 45 99