Cahiers du CEREMADE 

Unité Mixte de Recherche du C.N.R.S. N°7534

Abstract: An observer of a process (X_t) believes the process is
governed by Q whereas the true law is P. We bound the expected
average L1 distance between the stage-conditional distributions of P and Q by a function of
the relative entropy between the marginals of P and Q on the first n
realizations. We apply this bound to the cost of learning in
sequential decision problems and to the merging of Q to P.





2006-48

24-10-2006

Université de Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, 75775 Paris Cedex 16, France
Telephone: +33 (0)1 44 05 49 23, fax: +33 (0)1 44 05 45 99