Publication Detail
Cooperation and Reputation Dynamics with Reinforcement Learning
Publication Type: Conference
Authors: Anastassacos N, García J, Hailes S, Musolesi M
Publisher: ACM
Publication date: 07/05/2021
Published proceedings: Proceedings of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021)
Name of conference: 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021)
Conference place: Virtual Event, United Kingdom
Conference start date: 03/05/2021
Conference finish date: 07/05/2021
Keywords: cs.MA, cs.AI
Author URL:
Notes: Published in AAMAS'21, 9 pages
Abstract
Creating incentives for cooperation is a challenge in natural and artificial
systems. One potential answer is reputation, whereby agents trade the immediate
cost of cooperation for the future benefits of having a good reputation. Game
theoretical models have shown that specific social norms can make cooperation
stable, but how agents can independently learn to establish effective
reputation mechanisms on their own is less understood. We use a simple model of
reinforcement learning to show that reputation mechanisms generate two
coordination problems: agents need to learn how to coordinate on the meaning of
existing reputations and collectively agree on a social norm to assign
reputations to others based on their behavior. These coordination problems
exhibit multiple equilibria, some of which effectively establish cooperation.
When we train agents with a standard Q-learning algorithm in an environment
with the presence of reputation mechanisms, convergence to undesirable
equilibria is widespread. We propose two mechanisms to alleviate this: (i)
seeding a proportion of the system with fixed agents that steer others towards
good equilibria; and (ii) intrinsic rewards based on the idea of
introspection, i.e., augmenting agents' rewards by an amount proportionate to
the performance of their own strategy against themselves. A combination of
these simple mechanisms is successful in stabilizing cooperation, even in a
fully decentralized version of the problem where agents learn to use and assign
reputations simultaneously. We show how our results relate to the literature in
Evolutionary Game Theory, and discuss implications for artificial, human and
hybrid systems, where reputations can be used as a way to establish trust and
cooperation.
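The two mechanisms described in the abstract can be illustrated with a short, self-contained sketch. The Python below is not the authors' implementation: the donation-game payoffs, the Q-learning hyperparameters, the introspection weight, the stern-judging-style norm in assign_reputation, and the fraction of fixed seed agents are all illustrative assumptions.

```python
# Minimal sketch, not the authors' implementation: tabular Q-learning agents
# playing a pairwise donation game with binary reputations.  Payoff values,
# hyperparameters, the social norm in assign_reputation, and the fraction of
# fixed "seed" agents are all illustrative assumptions.
import random
from collections import defaultdict

COOPERATE, DEFECT = 1, 0
BENEFIT, COST = 2.0, 1.0               # assumed donation-game payoffs (b > c)
ALPHA, EPSILON, BETA = 0.1, 0.1, 0.5   # learning rate, exploration, introspection weight


class Agent:
    def __init__(self, fixed=False):
        self.fixed = fixed                   # mechanism (i): fixed seed agents
        self.q = defaultdict(float)          # Q[(partner_reputation, action)]
        self.reputation = random.randint(0, 1)

    def greedy(self, partner_rep):
        if self.fixed:
            # Seed agents discriminate: cooperate only with good reputations.
            return COOPERATE if partner_rep == 1 else DEFECT
        return max((COOPERATE, DEFECT), key=lambda a: self.q[(partner_rep, a)])

    def act(self, partner_rep):
        if not self.fixed and random.random() < EPSILON:
            return random.choice([COOPERATE, DEFECT])
        return self.greedy(partner_rep)

    def update(self, partner_rep, action, reward):
        if self.fixed:
            return
        # Mechanism (ii), introspection: add a bonus proportional to the payoff
        # the agent's own greedy strategy would earn when played against itself.
        self_action = self.greedy(self.reputation)
        self_play_payoff = (BENEFIT - COST) if self_action == COOPERATE else 0.0
        target = reward + BETA * self_play_payoff
        key = (partner_rep, action)
        self.q[key] += ALPHA * (target - self.q[key])


def donation_payoffs(action_1, action_2):
    # Each cooperator pays COST so that the other player receives BENEFIT.
    r1 = (BENEFIT if action_2 == COOPERATE else 0.0) - (COST if action_1 == COOPERATE else 0.0)
    r2 = (BENEFIT if action_1 == COOPERATE else 0.0) - (COST if action_2 == COOPERATE else 0.0)
    return r1, r2


def assign_reputation(action, partner_rep):
    # Assumed stern-judging-style norm: cooperating with a good partner or
    # defecting against a bad one earns a good (1) reputation.
    justified = COOPERATE if partner_rep == 1 else DEFECT
    return 1 if action == justified else 0


agents = [Agent(fixed=(i < 4)) for i in range(20)]   # seed 20% of agents as fixed
for _ in range(20000):
    a, b = random.sample(agents, 2)
    rep_a, rep_b = a.reputation, b.reputation
    act_a, act_b = a.act(rep_b), b.act(rep_a)
    r_a, r_b = donation_payoffs(act_a, act_b)
    a.update(rep_b, act_a, r_a)
    b.update(rep_a, act_b, r_b)
    a.reputation = assign_reputation(act_a, rep_b)
    b.reputation = assign_reputation(act_b, rep_a)

learners = [ag for ag in agents if not ag.fixed]
coop = sum(ag.greedy(1) == COOPERATE for ag in learners) / len(learners)
print(f"Learners cooperating with good-reputation partners: {coop:.2f}")
```

In this toy version, the introspection bonus raises the value of cooperation whenever an agent's own greedy response to its current reputation is to cooperate, which is one plausible reading of "performance of their own strategy against themselves"; the paper's actual environment, norm space, and reward shaping may differ.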