Bounded recursive self-improvement
Abstract
Four principal features of autonomous control systems are left both unaddressed and unaddressable by present-day engineering methodologies: (1) the ability to operate effectively in environments that are only partially known at design time; (2) a level of generality that allows a system to re-assess and re-define the fulfillment of its mission in light of unexpected constraints or other unforeseen changes in the environment; (3) the ability to operate effectively in environments of significant complexity; and (4) the ability to degrade gracefully, that is, to continue striving to achieve its main goals when resources become scarce or when other expected or unexpected constraining factors impede its progress. We describe new methodological and engineering principles for addressing these shortcomings, which we have used to design a machine that becomes increasingly better at behaving in underspecified circumstances, in a goal-directed way, on the job, by modeling itself and its environment as experience accumulates. Based on principles of autocatalysis, endogeny, and reflectivity, the work provides an architectural blueprint for constructing systems with high levels of operational autonomy in underspecified circumstances, starting from only a small amount of designer-specified code, a seed. Using value-driven dynamic priority scheduling to control the parallel execution of a vast number of lines of reasoning, the system accumulates increasingly useful models of its experience, resulting in recursive self-improvement that can be autonomously sustained after the machine leaves the lab, within the boundaries imposed by its designers. A prototype system has been implemented and demonstrated to learn a complex real-world task, real-time multimodal dialogue with humans, by on-line observation. Our work presents solutions to several challenges that must be solved to achieve artificial general intelligence.
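The abstract names value-driven dynamic priority scheduling but does not spell out a mechanism. The following is a minimal illustrative sketch of the general idea only, not the scheduler of the system described here. The `Job` type, the `priority` formula (estimated value divided by time remaining before a deadline), and the `budget` parameter are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """One pending line of reasoning (hypothetical structure for this sketch)."""
    name: str
    value: float     # assumed estimate of how useful the result would be
    deadline: float  # assumed time after which the result is no longer useful


def priority(job: Job, now: float) -> float:
    """Value per unit of remaining time; jobs past their deadline get 0."""
    remaining = job.deadline - now
    return job.value / remaining if remaining > 0 else 0.0


def run(jobs: list[Job], budget: int, step: float = 1.0) -> list[str]:
    """Execute at most `budget` jobs, re-ranking all pending jobs at every step.

    Priorities are dynamic: they are recomputed from the current time, so a
    low-value job can overtake a high-value one as its deadline approaches.
    """
    pending = list(jobs)
    executed = []
    now = 0.0
    for _ in range(budget):
        # Drop jobs whose deadline has passed; they are no longer worth running.
        live = [j for j in pending if priority(j, now) > 0]
        if not live:
            break
        best = max(live, key=lambda j: priority(j, now))
        pending.remove(best)
        executed.append(best.name)
        now += step  # assume every job costs one time step
    return executed


jobs = [
    Job("A", value=10.0, deadline=10.0),
    Job("B", value=3.0, deadline=2.0),
    Job("C", value=1.0, deadline=1.5),
]
order = run(jobs, budget=3)
```

In this toy run the resource bound is the `budget` argument: with a smaller budget, fewer jobs are executed and the rest are simply left pending, which is one simple way to picture operation under scarce resources.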