Add tabular MARL algorithms including Nash-Q and Correlated-Q #821
Conversation
Hi @rezunli96, the tests are failing for the same mysterious reasons as the ones in #818 (which are likely unrelated to your changes). I think you will have to wait until we merge #807, which upgrades the JAX dependency versions. I'm on vacation at the moment, but I will import and merge that PR next week when I'm back and will let you know; then you can merge with master and the tests will likely pass.
Hi @rezunli96, the CI tests are now fixed on the master branch. Can you pull changes and push the merge to trigger a new invocation of the tests? Thanks!
Also, can you edit the name of this PR to better reflect what algorithms it's adding? Is it tabular CE-Q and/or Nash-Q?
Thank you, @lanctot! The check is done. Yes, it adds a general joint-action Q-learner, including Nash-Q and Correlated-Q.
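For context, here is a minimal sketch of what a joint-action (Markov game) Q-update of this kind looks like. The function and variable names, and the two-player restriction, are my assumptions; the stage-game solver stands in for the Nash or correlated-equilibrium solver and is not the PR's actual API.

```python
# Minimal sketch (assumed names, two players) of a joint-action Q-update in the
# spirit of Nash-Q / Correlated-Q: the bootstrap value comes from solving the
# stage game defined by both players' Q tables at the next state.
def joint_q_update(q, state, actions, rewards, next_state, lr, gamma, stage_solver):
  """q[p][s] is an (|A_0|, |A_1|) array of player p's joint-action values at state s."""
  # stage_solver returns one equilibrium value per player for the next-state stage
  # game (a Nash solver for Nash-Q, a correlated-equilibrium solver for Correlated-Q).
  next_values = stage_solver([q[p][next_state] for p in range(2)])
  a0, a1 = actions
  for p in range(2):
    target = rewards[p] + gamma * next_values[p]
    q[p][state][a0, a1] += lr * (target - q[p][state][a0, a1])
```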
Apologies for the delay! I left some initial comments, I'll check again later.
from open_spiel.python import rl_agent
from open_spiel.python import rl_tools
from open_spiel.python.algorithms.jpsro import _mgce, _mgcce
from open_spiel.python.examples.matrix_nash import lemke_howson_solve
We should not be importing from examples. Can you refactor the example to move lemke_howson_solve to its own file? (Maybe python/algorithms/lemke_howson.py)
Sure! I made a new file algorithms/matrix_nash and changed the original examples/matrix_nash into an _example file, so that lemke_howson_solve can be imported from other places.
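After that refactor, the import would look something like the line below; the exact module path is my assumption based on the comment above.

```python
# Assumed import path after the refactor: the Lemke-Howson solver now lives
# under algorithms rather than examples.
from open_spiel.python.algorithms.matrix_nash import lemke_howson_solve
```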
def __call__(self, q_tables, action_dims):
  '''
  Args:
    q_tables: a list of Q tables for each agent
What is the type of the Q table?
Any reason not to have these be a 3-dimensional (player, row, col) numpy array, where the indices correspond to indices in each player's legal moves list? That's how we normally do it elsewhere; we would not have to pass in action_dims, and it's generally easier to work with numpy arrays.
Yes, it makes sense. Now I pass action_dims directly as an explicit input argument to the MAQagent (num_of_actions, which is now a list), so a JointActionSolver won't need it.
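To illustrate the representation being discussed, here is a small sketch; the shapes and names are assumptions, not the merged API. With a per-state numpy array of joint-action values, the action dimensions are recoverable from the array itself.

```python
# Sketch of the discussed representation (assumption, not the final API):
# one numpy array per state with shape (num_players, |A_0|, |A_1|), so a
# JointActionSolver can read the action sizes from the array instead of
# taking a separate action_dims argument.
import numpy as np

num_of_actions = [3, 2]                        # per-player action counts
q_table = np.zeros((2, num_of_actions[0], num_of_actions[1]))
# q_table[p, a0, a1] = player p's value for the joint action (a0, a1).
action_dims = q_table.shape[1:]                # (3, 2), read back from the array
```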
Thanks. Looking good! I added a few more comments.
Two other things:
- Can you rename tabular_maqlearner -> tabular_joint_action_learner (or tabular_multiagent_qlearner, or tabular_markov_qlearner)?
- Can you add a small test (something similar to the example)? It can be a very simple pathfinding instance, like a 3x3 map, that just verifies the agents learn to go to their destinations after 100 or so episodes.
Should be OK to import after these two and the other comments are addressed.
Can I ask for one more test, on rock paper scissors, to check that it computes a mixed strategy and gets near the equilibrium value?
Sure! I just added an RPS test to check whether the Nash-Q learners learn the correct equilibrium and values.
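As a rough idea of what such a check can assert (a sketch with hypothetical values, not the actual test in the PR): the learned RPS strategy should be close to uniform and nearly unexploitable, since the game value at equilibrium is 0.

```python
# Sketch of an RPS equilibrium check (hypothetical, not the PR's actual test):
# the Nash equilibrium of rock-paper-scissors is uniform with game value 0.
import numpy as np

RPS_PAYOFF = np.array([[ 0, -1,  1],
                       [ 1,  0, -1],
                       [-1,  1,  0]])  # row player's payoff matrix

def best_response_value(opponent_policy, payoff=RPS_PAYOFF):
  # Value of the best pure response against opponent_policy; 0 iff unexploitable.
  return np.max(payoff @ opponent_policy)

learned_policy = np.array([0.34, 0.33, 0.33])  # e.g. read from the trained Nash-Q agent
assert np.allclose(learned_policy, 1.0 / 3.0, atol=0.05)
assert best_response_value(learned_policy) < 0.05
```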
Hello, I am trying to implement classical tabular MARL algorithms (Nash-Q, Correlated-Q). I adapted the code in tabular_qlearner and several other utility functions in OpenSpiel. Please let me know if there are any issues. Thanks!