A PROBABILISTIC ANALYSIS OF BIAS OPTIMALITY IN UNICHAIN MARKOV DECISION PROCESSES


Mark E. Lewis
Industrial and Operations Engineering
University of Michigan
1205 Beal Avenue
Ann Arbor, MI 48109-2117

Martin L. Puterman
Faculty of Commerce and Business Administration
The University of British Columbia
2053 Main Mall
Vancouver, BC V6T 1Z2 Canada

This paper focuses on bias optimality in unichain Markov decision processes with finite state and action spaces. Using relative value functions, we present new methods for evaluating the optimal bias. This leads to a probabilistic analysis that transforms the original reward problem into a minimum average cost problem. The result is an explanation of how and why bias implicitly discounts future rewards.
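
To make the quantities concrete, the following minimal sketch evaluates the gain and bias of a single fixed stationary policy in a unichain model. It is an illustration only and not the evaluation method developed in this paper. It assumes the standard evaluation equations h + g e = r + P h together with the normalization pi h = 0, where pi is the stationary distribution of P. The function names `solve` and `gain_and_bias` and the two-state example are hypothetical.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination
    with partial pivoting (pure Python, intended for small systems)."""
    n = len(b)
    # Build the augmented matrix [a | b] without mutating the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gain_and_bias(p, r):
    """Return (g, h) for a fixed policy with transition matrix p and
    reward vector r, assuming p is unichain.

    g is the (state-independent) gain and h is the bias, i.e. the
    solution of h + g = r + P h that also satisfies pi . h = 0.
    """
    n = len(r)
    # Stationary distribution: pi (I - P) = 0, with one equation
    # replaced by the normalization sum(pi) = 1.
    a = [[(1.0 if i == j else 0.0) - p[j][i] for j in range(n)]
         for i in range(n)]
    a[-1] = [1.0] * n
    pi = solve(a, [0.0] * (n - 1) + [1.0])
    g = sum(pi[i] * r[i] for i in range(n))
    # Bias via the fundamental matrix: (I - P + e pi) h = r - g e.
    z_inv = [[(1.0 if i == j else 0.0) - p[i][j] + pi[j] for j in range(n)]
             for i in range(n)]
    h = solve(z_inv, [r[i] - g for i in range(n)])
    return g, h


# Hypothetical example: state 0 is transient and earns 5 once, then the
# chain is absorbed in state 1, which earns 1 per period.
g, h = gain_and_bias([[0.0, 1.0], [0.0, 1.0]], [5.0, 1.0])
print(g, h)
```

In this example the gain is 1 from either state, while the bias of state 0 is 4: the extra reward collected before absorption. Two policies with equal gain can differ in exactly this transient term, which is the distinction that bias optimality captures.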