Abstract: An observer of a process (X_t) believes the process is
governed by Q whereas the true law is P. We bound the expected
average L1 distance between the stage conditional distributions of P and Q by a function of
the relative entropy between the marginals of P and Q on the first n
realizations. We apply this bound to the cost of learning in
sequential decision problems and to the merging of Q to P. 





