Safety Engineering for Artificial General Intelligence

Joshua Fox

doi:10.1007/S11245-012-9128-9

2012

Abstract

Machine ethics and robot rights are quickly becoming hot topics in artificial intelligence and robotics communities. We will argue that attempts to attribute moral agency and assign rights to all intelligent machines are misguided, whether applied to infrahuman or superhuman AIs, as are proposals to limit the negative effects of AIs by constraining their behavior. As an alternative, we propose a new science of safety engineering for intelligent artificial agents based on maximizing for what humans value.
