Speaking Skills Talk - Carlos Martin

— 3:00pm

Location:
In Person - Newell-Simon 4305

Speaker:
CARLOS MARTIN, Ph.D. Student, Computer Science Department, Carnegie Mellon University
http://carlosgmartin.com/


Finding Mixed-Strategy Nash Equilibria of Black-Box Continuous-Action Games Using Randomized Policy Networks and Joint-Perturbation Simultaneous Pseudo-Gradients

We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients of the utility function. This problem arises in multi-agent reinforcement learning settings where the environment is treated as a black box. It also arises in games whose payoffs are discontinuous with respect to players' actions, such as auctions. To tackle this problem, we apply zeroth-order optimization techniques that combine smoothed gradient estimators with equilibrium-finding dynamics.
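To illustrate the zeroth-order idea, here is a minimal sketch of a two-point Gaussian-smoothed gradient estimator for a scalar black-box function. The function name, the toy objective, and all parameter values are illustrative, not the talk's exact estimator:

```python
import random

def smoothed_pseudo_gradient(f, x, sigma=0.1, samples=2000, rng=None):
    """Estimate f'(x) using only evaluations of f (no analytic gradient).

    For Gaussian noise u, E[(f(x + sigma*u) - f(x - sigma*u)) * u / (2*sigma)]
    approximates the gradient of a smoothed version of f at x.
    """
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(samples):
        u = rng.gauss(0.0, 1.0)
        total += (f(x + sigma * u) - f(x - sigma * u)) * u / (2.0 * sigma)
    return total / samples

# Toy check: for f(x) = x^2 the true gradient at x = 3 is 6.
est = smoothed_pseudo_gradient(lambda x: x * x, 3.0)
```

Plugging such an estimator into gradient-based equilibrium-finding dynamics is what lets the method treat the game as a black box.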

We model players' strategies using randomized policy networks. These networks take noise in addition to observations as input, and can flexibly approximate arbitrary observation-dependent continuous-action distributions. This representational power is crucial for tackling continuous-action games that lack pure-strategy equilibria.
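The following toy sketch shows the key property of a randomized policy: feeding fresh noise alongside a fixed observation yields a distribution over continuous actions rather than a single deterministic action. The tiny linear-plus-tanh "network" and its dimensions are stand-ins for an actual neural network:

```python
import math
import random

def randomized_policy(obs, noise, weights, bias):
    """Map (observation, noise) -> a continuous action in (-1, 1).

    Because the noise input varies per call, the induced action
    distribution can represent an observation-dependent mixed strategy.
    """
    x = list(obs) + list(noise)  # concatenate observation and noise inputs
    return math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)

rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(3)]  # 2 obs dims + 1 noise dim
obs = [0.5, -0.2]  # fixed observation
actions = [randomized_policy(obs, [rng.gauss(0.0, 1.0)], weights, 0.0)
           for _ in range(5)]  # same observation, different sampled actions
```

A deterministic policy would map the fixed observation to one action; here the five sampled actions differ, which is exactly what is needed to represent a mixed strategy.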

In addition, we introduce a new training technique that reduces the number of utility function evaluations per iteration from linear to constant in the number of players. It achieves this by performing a single joint perturbation of all players' strategies, rather than perturbing each one individually. This yields dramatic benefits for many-player games in which each evaluation incurs costs in wall time, memory, money, or other resources.
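A minimal sketch of the joint-perturbation idea, under simplifying assumptions (scalar strategies, a toy separable utility function, and illustrative names throughout): all players' strategies are perturbed with a single joint noise vector, and because one game evaluation returns every player's utility, two evaluations per iteration suffice regardless of the number of players.

```python
import random

def joint_perturbation_pseudo_gradients(game, strategies, sigma, rng):
    """Estimate every player's pseudo-gradient from ONE joint perturbation.

    `game(xs)` returns a utility per player. Two evaluations (at the
    jointly perturbed points) give an estimate for all players at once,
    instead of perturbing each player's strategy individually.
    """
    v = [rng.gauss(0.0, 1.0) for _ in strategies]  # joint perturbation
    up = game([s + sigma * vi for s, vi in zip(strategies, v)])
    dn = game([s - sigma * vi for s, vi in zip(strategies, v)])
    return [(up[i] - dn[i]) * v[i] / (2.0 * sigma)
            for i in range(len(strategies))]

# Toy separable "game": player i's utility is -(x_i - 0.5)^2, so the
# true gradient for player i is -2 * (x_i - 0.5).
def game(xs):
    return [-(x - 0.5) ** 2 for x in xs]

strategies = [0.0, 1.0]  # true per-player gradients: [1.0, -1.0]
rng = random.Random(0)
samples = 2000
grads = [0.0, 0.0]
for _ in range(samples):
    g = joint_perturbation_pseudo_gradients(game, strategies, 0.05, rng)
    grads = [a + b / samples for a, b in zip(grads, g)]
```

Each iteration above costs two game evaluations whether there are 2 players or 200, which is the source of the savings for many-player games.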

We apply our technique to various auctions and resource-allocation games. Experiments show that our technique quickly finds high-quality approximate Nash equilibria of these games. 

This is joint work with Tuomas Sandholm.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement

Event Website:
https://csd.cmu.edu/calendar/speaking-skills-talk-carlos-martin