
Reduction of temporal complexity in Markov decision processes

2012, CONIELECOMP 2012, 22nd International Conference on Electrical Communications and Computers

Abstract

In this paper we present a new approach to the solution of Markov decision processes based on an abstraction technique over the action space, which yields a set of abstract actions. Markov decision processes have been applied successfully to many probabilistic problems, such as process control, decision analysis, and economics. However, for problems with continuous or high-dimensional domains, computational complexity becomes prohibitive because the search space grows exponentially with the number of variables. To reduce this complexity, our approach avoids evaluating the entire action space during value iteration and instead performs the backup at each state only over the abstract actions that actually operate on that state, determined as a function of the state. Experimental results on a robot path-planning task show a substantial reduction in computational complexity.
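The idea described in the abstract can be illustrated with a minimal sketch of value iteration in which the Bellman backup at each state ranges only over the actions applicable in that state. This is an illustrative assumption-laden example, not the paper's implementation: the names `applicable_actions`, `transition`, and `reward` are hypothetical, and the per-state action map is simply given here, whereas in the paper it would be produced by the abstraction technique.

```python
def value_iteration(states, applicable_actions, transition, reward,
                    gamma=0.95, epsilon=1e-4):
    """Tabular value iteration restricted to per-state applicable actions.

    states: iterable of hashable states.
    applicable_actions: dict mapping state -> list of abstract actions valid there.
    transition: dict mapping (state, action) -> list of (next_state, probability).
    reward: dict mapping (state, action) -> float.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Backup only over the abstract actions that operate on s,
            # instead of iterating over the whole action space.
            q_values = [
                reward[(s, a)]
                + gamma * sum(p * V[s2] for s2, p in transition[(s, a)])
                for a in applicable_actions[s]
            ]
            best = max(q_values) if q_values else 0.0
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < epsilon:
            return V
```

When the number of applicable actions per state is much smaller than the full action set, the cost of each sweep drops proportionally, which is the source of the reduction in temporal complexity that the paper reports.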
