Intrinsic motivation learning for real robot applications

Frontiers in Robotics and AI

https://doi.org/10.3389/FROBT.2023.1102438

Abstract

The integration of humanoid robots into daily life is hindered by their reliance on pre-programmed tasks and traditional control methods, which limit adaptability. This paper discusses the potential of intrinsic motivation learning combined with mental replay techniques to improve sample efficiency in real-world robot applications. By combining goal-directed motion with these learned behaviors, robots can learn and interact more effectively in complex environments.
