Abstract: An observer of a process (X_t) believes the process is governed by Q, whereas the true law is P. We bound the expected average L1 distance between the stage-conditional distributions of P and Q by a function of the relative entropy between the marginals of P and Q on the first n realizations. We apply this bound to the cost of learning in sequential decision problems and to the merging of Q to P.
Entropy Bounds on Bayesian Learning
GOSSNER Olivier, TOMALA Tristan
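
The kind of bound described in the abstract can be illustrated by a standard argument combining Pinsker's inequality, the chain rule for relative entropy, and Jensen's inequality. The sketch below is only an illustration under stated assumptions and is not necessarily the exact statement or constant obtained by the authors. The notation is assumed for this sketch: P_n and Q_n denote the marginals on the first n realizations, h_{t-1} = (X_1, ..., X_{t-1}) is the history before stage t, D is the relative entropy in nats, and D(P_n || Q_n) is assumed finite.

```latex
% Assumed notation (not from the source): h_{t-1} = (X_1,\dots,X_{t-1}),
% D_t = D\bigl(P(\cdot \mid h_{t-1}) \,\|\, Q(\cdot \mid h_{t-1})\bigr), logarithms are natural.
\begin{align*}
\frac{1}{n}\sum_{t=1}^{n} \mathbb{E}_P
  \bigl\| P(\cdot \mid h_{t-1}) - Q(\cdot \mid h_{t-1}) \bigr\|_1
&\le \frac{1}{n}\sum_{t=1}^{n} \mathbb{E}_P \sqrt{2\, D_t}
  && \text{(Pinsker's inequality)} \\
&\le \sqrt{\frac{2}{n}\sum_{t=1}^{n} \mathbb{E}_P D_t}
  && \text{(Jensen's inequality, concavity of } \sqrt{\cdot}\text{)} \\
&= \sqrt{\frac{2\, D(P_n \,\|\, Q_n)}{n}}
  && \text{(chain rule for relative entropy)}
\end{align*}
```

Under these assumptions, whenever D(P_n || Q_n) grows more slowly than n, the average expected L1 distance between the stage-conditional distributions vanishes as n grows.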
