Accumulating Risk Capital Through Investing in Cooperation

Bibliographic Details
Main Authors: International Conference on Autonomous Agents and Multi-Agent Systems 2021; Critch, Andrew; Dennis, Michael; Roman, Charlotte; Russell, Stuart
Format: Article in Journal/Newspaper
Language: unknown
Published: Underline Science Inc. 2021
Subjects:
Online Access: https://dx.doi.org/10.48448/p2cg-m986
https://underline.io/lecture/15418-accumulating-risk-capital-through-investing-in-cooperation
Description
Summary: Read the paper: http://www.ifaamas.org/Proceedings/aamas2021/pdfs/p1073.pdf
Chat about this paper on Discord: https://discord.com/channels/827790531085336607/833027070846173234
Discord is intended for in-depth chat with authors, independently of the AAMAS schedule. Please post any questions you would like the authors to address during the live Q&A right here on Underline.
Watch the video on SlidesLive: https://slideslive.com/38954847/accumulatingriskcapitalthroughinvesting-charlotteroman-michaeldennis-38954847-upGX.mp4

Abstract: Recent work on promoting cooperation in multi-agent learning has resulted in many methods which successfully promote cooperation at the cost of becoming more vulnerable to exploitation by malicious actors. We show that this is an unavoidable trade-off and propose an objective which balances these concerns, promoting both safety and long-term cooperation. Moreover, the trade-off between safety and cooperation is not severe, and you can receive exponentially large returns through cooperation from a small amount of risk. We both study an exact solution method and propose a method for training policies that targets this objective, Accumulating Risk Capital Through Investing in Cooperation (ARCTIC), and evaluate them in iterated Prisoner's Dilemma and Stag Hunt.
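
Illustration (not from the paper): the trade-off between safety and cooperation described in the abstract can be sketched in a toy iterated Prisoner's Dilemma. Everything below is an assumption made for illustration: the payoff values, the names `play`, `always_defect` and `tit_for_tat`, and the simple budget rule are hypothetical and are not the ARCTIC method or the exact solution method from the paper. The agent cooperates only while its accumulated surplus over the always-defect baseline (its "risk capital") covers the worst-case loss of one round.

```python
# Row-player payoffs in a one-shot Prisoner's Dilemma (assumed standard values).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
SAFE_VALUE = 1  # per-round payoff guaranteed by always defecting
STAKE = SAFE_VALUE - PAYOFF[("C", "D")]  # worst-case loss of cooperating once


def always_defect(agent_history):
    """Opponent that defects no matter what."""
    return "D"


def tit_for_tat(agent_history):
    """Opponent that cooperates first, then copies the agent's last move."""
    return agent_history[-1] if agent_history else "C"


def play(opponent, rounds, risk_budget):
    """Hypothetical budget rule: cooperate only while capital covers the stake.

    Capital starts at risk_budget and changes each round by the payoff
    received minus SAFE_VALUE, so it never drops below zero and the total
    payoff is at least rounds * SAFE_VALUE - risk_budget.
    """
    capital = risk_budget
    total = 0
    agent_history = []
    for _ in range(rounds):
        move = "C" if capital >= STAKE else "D"
        reply = opponent(agent_history)
        payoff = PAYOFF[(move, reply)]
        total += payoff
        capital += payoff - SAFE_VALUE
        agent_history.append(move)
    return total


if __name__ == "__main__":
    print(play(always_defect, rounds=10, risk_budget=1))
    print(play(tit_for_tat, rounds=10, risk_budget=1))
```

Under these assumptions, a budget of one payoff unit is the most the agent can lose against an opponent that always defects, while the same budget is enough to start and sustain cooperation against a reciprocating opponent.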