Clip
Reinforcement Learning and Human Diplomacy
Successful human Diplomacy players know how to take advantage of weaker players and persuade them to act against their own interests. The language model is controllable, and it is made to play in a human-compatible way so that it can succeed in the game, using an approach that combines self-play and search.