Achieving Cooperation in a Minimally Constrained Environment
Abstract
We describe a simple environment for studying cooperation between two agents, and a method of achieving cooperation in that environment. The environment consists of randomly generated normal form games with uniformly distributed payoffs. Agents play multiple games against each other, each game drawn independently from the random distribution. In this environment cooperation is difficult: Tit-for-Tat cannot be used because moves are not labeled as "cooperate" or "defect"; fictitious play cannot be used because the agent never sees the same game twice; and approaches suitable for stochastic games cannot be used because the set of states is not finite. Our agent identifies cooperative moves by assigning an attitude to its opponent and to itself. The attitude determines how much a player values its opponent's payoff, i.e., how much the player is willing to deviate from strictly self-interested behavior. To cooperate, our agent estimates the attitude of its opp...
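The attitude idea from the abstract can be sketched in a few lines. This is an illustrative assumption, not the paper's exact formulation: all names are hypothetical, and the linear form (own payoff plus attitude times the opponent's payoff) is one common way to model how much a player values its opponent's payoff.

```python
import random

def random_game(n_rows, n_cols, rng):
    """Randomly generated normal-form game: each cell holds
    (row payoff, column payoff), both uniform on [0, 1]."""
    return [[(rng.random(), rng.random()) for _ in range(n_cols)]
            for _ in range(n_rows)]

def modified_payoffs(game, attitude_row, attitude_col):
    """Attitude-modified game: each player's payoff plus its attitude
    times the opponent's payoff. Attitude 0 is strict self-interest;
    attitude 1 weighs the opponent's payoff as heavily as one's own."""
    return [[(u_r + attitude_row * u_c, u_c + attitude_col * u_r)
             for (u_r, u_c) in row]
            for row in game]

def best_response_row(game, col_move):
    """Row move maximizing the row player's (possibly modified)
    payoff against a fixed column move."""
    return max(range(len(game)), key=lambda i: game[i][col_move][0])

rng = random.Random(0)
g = random_game(3, 3, rng)
selfish = modified_payoffs(g, 0.0, 0.0)   # purely self-interested play
generous = modified_payoffs(g, 1.0, 1.0)  # both players weigh the other fully
# The selfish and generous games can disagree on which move is "best",
# which is how cooperative moves become identifiable without labels.
```

Because each game is drawn fresh from the uniform distribution, a strategy defined over moves in one game carries no meaning in the next; only the attitude parameter persists across games, which is what makes it a usable signal of cooperativeness.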
References (14)
- Altman, A.; Bercovici-Boden, A.; and Tennenholtz, M. 2006. Learning in one-shot strategic form games. In Proc. European Conf. on Machine Learning, 6-17. Springer.
- Arulampalam, M. S.; Maskell, S.; Gordon, N.; and Clapp, T. 2002. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing 50(2):174-188.
- Axelrod, R. M. 1984. The evolution of cooperation. Basic Books.
- Crandall, J. W., and Goodrich, M. A. 2005. Learning to compete, compromise, and cooperate in repeated general-sum games. In Proc. of the Int'l Conf. on Machine Learning, 161-168. New York, NY, USA: ACM.
- Damer, S., and Gini, M. 2008. A minimally constrained environment for the study of cooperation. Technical Report 08-013, University of Minnesota, Department of CSE.
- Fudenberg, D., and Levine, D. K. 1998. The Theory of Learning in Games. MIT Press.
- Govindan, S., and Wilson, R. 2008. Refinements of Nash equilibrium. In Durlauf, S. N., and Blume, L. E., eds., The New Palgrave Dictionary of Economics, 2nd Edition. Palgrave Macmillan.
- Kraus, S.; Rosenschein, J. S.; and Fenster, M. 2000. Exploiting focal points among alternative solutions: Two approaches. Annals of Mathematics and Artificial Intelligence 28(1-4):187-258.
- Levine, D. K. 1998. Modeling altruism and spitefulness in experiments. Review of Economic Dynamics 1:593-622.
- McKelvey, R., and McLennan, A. 1996. Computation of equilibria in finite games. In Handbook of Computational Economics.
- Nowak, M. A., and Sigmund, K. 1993. A strategy of win-stay, lose-shift that outperforms Tit for Tat in the Prisoner's Dilemma game. Nature 364:56-58.
- Rabin, M. 1993. Incorporating fairness into game theory and economics. The American Economic Review 83(5):1281-1302.
- Saha, S.; Sen, S.; and Dutta, P. S. 2003. Helping based on future expectations. In Proc. of the Second Int'l Conf. on Autonomous Agents and Multi-Agent Systems, 289-296.
- Sally, D. F. 2002. Two economic applications of sympathy. The Journal of Law, Economics, and Organization 18:455-487.
- Shapley, L. S. 1953. Stochastic games. Proceedings of the National Academy of Sciences 39:1095-1100.
- Shoham, Y.; Powers, R.; and Grenager, T. 2003. Multi-agent reinforcement learning: a critical survey. Technical Report, Stanford University.
- Trivers, R. L. 1971. The evolution of reciprocal altruism. Quarterly Review of Biology 46:35-57.