International Journal of Computational Intelligence Research (IJCIR)
Volume 3, Number 1 (2007)
Reinforcement learning approaches for constrained MDPs
Institute of Cognitive Science, AI Group, University of Osnabrück, Germany
Most reinforcement learning approaches consider Markov decision processes (MDPs) with a single criterion. In practical applications, however, we often have to deal with additional criteria, e.g. the energy consumed or the time spent while solving the main task. In this article, we therefore consider MDPs with two criteria, each defined as the expected value of a cumulative return, where the second criterion is subject to an inequality constraint. We describe two new reinforcement learning approaches for solving such control problems, discuss their advantages and shortcomings, and present experimental results based on randomly generated MDPs.
Machine Learning, Reinforcement Learning, Dynamic Programming, Constraints.