Cooperative Q-learning: the knowledge sharing issue

2001, Advanced Robotics

https://doi.org/10.1163/156855301317198142

Abstract

A group of homogeneous, cooperative Q-learning agents can learn faster and gain more knowledge by sharing what they learn. To do so, each learner agent must be able to evaluate the expertness and intelligence level of the other agents and to assess the knowledge and information it receives from them. In addition, the learner needs a suitable method for combining its own knowledge with what it gains from the other agents according to their relative expertness. In this paper, several expertness measuring criteria are introduced, along with a new cooperative learning method called weighted strategy sharing (WSS). In WSS, each agent assigns a weight to each teammate's knowledge based on that teammate's expertness and utilizes it accordingly. WSS and the expertness criteria are tested on two simulated systems: a hunter-prey environment and an object-pushing task.
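A minimal sketch of the sharing step the abstract describes: each agent's new Q-table is formed as a weighted average of all agents' tables, with weights proportional to expertness. The function name `wss_update`, the `impressibility` parameter (how much the learner defers to others), and the exact normalization below are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

def wss_update(q_tables, expertness, learner_idx, impressibility=0.8):
    """Expertness-weighted combination of Q-tables (a WSS-style sketch).

    The learner keeps a (1 - impressibility) share of its own table and
    spreads the remaining share over all agents in proportion to their
    (assumed non-negative) expertness. Both `impressibility` and this
    normalization are illustrative assumptions, not necessarily the
    weighting scheme defined in the paper.
    """
    e = np.asarray(expertness, dtype=float)
    if e.sum() == 0:                        # no expertness signal: keep own table
        return q_tables[learner_idx].copy()
    weights = impressibility * e / e.sum()  # expertness-proportional shares
    weights[learner_idx] += 1.0 - impressibility
    return sum(w * q for w, q in zip(weights, q_tables))

# Example: agent 0, the least expert, leans on agents 1 and 2.
q_tables = [np.zeros((4, 2)), np.ones((4, 2)), 2 * np.ones((4, 2))]
expertness = [1.0, 5.0, 10.0]
new_q = wss_update(q_tables, expertness, learner_idx=0)
```

With the example values above, the least expert agent's table moves strongly toward those of its more expert teammates, while an agent whose own expertness dominates the sum would largely keep its own table.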
