Strategic Features for General Games
2019, ArXiv
Abstract
This short paper describes an ongoing research project that requires the automated self-play learning and evaluation of a large number of board games in digital form. We describe the approach we are taking to determine relevant features for biasing Monte Carlo tree search (MCTS) playouts for arbitrary games played on arbitrary geometries. Benefits of our approach include efficient implementation, the potential to transfer learnt knowledge to new contexts, and the potential to explain strategic knowledge embedded in features in human-comprehensible terms.