
Neural Learning of Heuristic Functions for General Game Playing

2016, Machine Learning, Optimization, and Big Data

https://doi.org/10.1007/978-3-319-51469-7_7

Abstract

The proposed model is an original approach to general game playing that aims to create a player able to develop a strategy from as few requirements as possible, maximizing generality. The main idea is to modify the well-known minimax search algorithm by removing its task-specific component, the heuristic function, and replacing it with a neural network trained to evaluate game states using the results of previously simulated matches. A method for simulating matches and extracting training examples from them is also proposed, completing an automatic procedure for setting up and improving the model. Part of the example-extraction algorithm is the Backward Iterative Deepening Search, a new search algorithm that aims to find, within a limited time, a large number of leaves together with their common ancestors.
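To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of a depth-limited minimax in which the hand-crafted heuristic is replaced by a learned state evaluator. The toy take-1-or-2 game, the `State` interface, and the `evaluate_state` stub are all hypothetical placeholders; in the paper the cutoff value would come from a neural network trained on previously simulated matches.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class State:
    pile: int            # toy game: players alternately remove 1 or 2 stones
    max_to_move: bool

    def is_terminal(self) -> bool:
        return self.pile == 0

    def utility(self) -> float:
        # the player forced to move on an empty pile has lost
        return -1.0 if self.max_to_move else 1.0

    def successors(self) -> List["State"]:
        return [State(self.pile - k, not self.max_to_move)
                for k in (1, 2) if k <= self.pile]

def evaluate_state(state: State) -> float:
    # stand-in for the trained network: the paper would return a learned
    # estimate of the state's value here instead of a fixed constant
    return 0.0

def minimax(state: State, depth: int) -> float:
    if state.is_terminal():
        return state.utility()            # exact value at true leaves
    if depth == 0:
        return evaluate_state(state)      # learned evaluation at the cutoff
    values = [minimax(s, depth - 1) for s in state.successors()]
    return max(values) if state.max_to_move else min(values)

if __name__ == "__main__":
    print(minimax(State(pile=7, max_to_move=True), depth=4))
```

Because the evaluator is learned from self-generated match data rather than written by hand, nothing in the search above is specific to any one game, which is the property the abstract describes as maximum generality.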
