Using open_spiel to train and test various MARL algorithms on dominoes
This section lists relevant papers, resources, and other links. The list is not exhaustive; it is meant as a starting point for further research.
~: To revisit
N: Not read
Y: Read
Papers | Category | Reference | Status |
--- | --- | --- | --- |
Connecting Optimal Ex-Ante Collusion in Teams to Extensive-Form Correlation | TMECor | Farina et al. '21 | ~ |
Multi-Agent Coordination in Adversarial Environments | MARL | Cacciamani et al. '21 | Y |
Steering No-Regret Learners to Optimal Equilibria | Equilibria* | Zhang et al. '23 | ~ |
Hindsight and Sequential Rationality of Correlated Play | Sequential Equilibria | Morrill et al. '20 | N |
Learning to Correlate in Multi-Player General-Sum Sequential Games | Sequential Equilibria | Celli et al. '19 | ~ |
No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium | Sequential Equilibria | Celli et al. '20 | Y |
Coarse Correlation in Extensive-Form Games | Sequential Equilibria | Farina et al. '20 | ~ |
Efficient Deviation Types and Learning for Hindsight Rationality | EFR | Morrill et al. '22 | N |
Counterfactual Regret Minimization (CFR) | Tabular | Zinkevich et al. '08, Neller & Lanctot '13 | N |
Deep CFR | MARL | Brown et al. '18 | N |
Mastering Stratego | Applications (MARL) | Perolat et al. '22 | N |
Cicero: LLMs with strategic reasoning | Applications | FAIR '22 | ~ |
A Generalist Agent (Gato) | Applications | Reed et al. '22 | Y |
Superhuman AI for multiplayer poker | Applications (CFR) | Brown and Sandholm '19 | Y |
α-Rank | Eval. / Viz. | Omidshafiei et al. '19, arXiv | N |
Nash Averaging | Eval. / Viz. | Balduzzi et al. '18 | N |
Paper title | theme | authors | ~ |
Other Resources