MultiCategorial distribution #1

Closed
breichholf opened this issue Jan 29, 2020 · 3 comments
Labels
enhancement New feature or request

Comments

@breichholf

Hi,

I wanted to take the TF2 version of PPO for a spin. My custom env uses a MultiDiscrete action space, but that isn't implemented here yet. While I'm still in my early days of ML and TensorFlow coding, I noticed that this tf2 branch now uses the Sequential and Keras API, rather than implementing neural cells outright.

I'm in no rush and quite happily using the current version of stable-baselines, nevertheless I wanted to ask if I could contribute in any way. While I'd like to say "I would like to implement MultiCategorialDistribution" my skills aren't quite there yet. Either way, things I would probably want to take a stab at first are: MultiCategorialDistribution and implementing LSTM cells on the policy side.

That being said, I'm also happy to just help out on the doc side.

Cheers,
Brian

@araffin
Member

araffin commented Jan 29, 2020

Hello,
Thanks for offering to help. I would recommend that you first read the two issues related to V3:

> MultiCategorialDistribution

This would be a good contribution, but it is not the focus right now. In fact, since we already have CategoricalDistribution working, it should be pretty straightforward.
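
To illustrate why this can build on a working categorical distribution: a MultiDiscrete action can be treated as several independent categorical choices, one per action dimension. The sketch below is plain Python using only the standard library. It is not the stable-baselines code, and the class name `MultiCategoricalSketch`, its methods, and the `nvec` argument (the number of choices per dimension) are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class MultiCategoricalSketch:
    """Hypothetical helper: one independent categorical per action dimension."""

    def __init__(self, flat_logits, nvec):
        # nvec lists the number of choices per dimension, e.g. [3, 2]
        if len(flat_logits) != sum(nvec):
            raise ValueError("expected sum(nvec) logits")
        self.probs = []
        start = 0
        for n in nvec:
            # Each slice of the flat logits parameterizes one categorical
            self.probs.append(softmax(flat_logits[start:start + n]))
            start += n

    def sample(self, rng=random):
        # Draw each sub-action independently from its own categorical
        return [rng.choices(range(len(p)), weights=p)[0] for p in self.probs]

    def log_prob(self, action):
        # Independence: the joint log-probability is the sum over dimensions
        return sum(math.log(p[a]) for p, a in zip(self.probs, action))

    def entropy(self):
        # Entropies of independent variables add up as well
        return -sum(q * math.log(q) for p in self.probs for q in p if q > 0)


# Assumed example: two action dimensions with 3 and 2 choices
dist = MultiCategoricalSketch([0.0, 0.0, 0.0, 1.0, -1.0], nvec=[3, 2])
action = dist.sample()
```

The design choice here is that the policy network would output a single flat vector of `sum(nvec)` logits, which is then split per dimension. A real implementation would do the same with tensors and batched operations.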

> implementing LSTM cells

This is much trickier and won't be tackled before v3.1, because it will also add a lot of complexity.

> That being said, I'm also happy to just help out on the doc side.

This would be very helpful, especially help with typing everything. But for that, you can already contribute to stable-baselines v2.x ;) (we will sync changes at some point).

For now, I plan to rewrite huge chunks of the current draft until the maintainers agree on the basic design. Until then, I would mostly only accept feedback (in this issue: hill-a/stable-baselines#576) on improving/simplifying the code, and feature requests that should be in v3.0 (more complex or non-critical features will wait for v3.1).

@araffin added the enhancement label Jan 29, 2020
@breichholf
Author

Cool, sounds great.

Yeah, I figured LSTMs wouldn't be a simple drop-in, and that MultiCategorial would be the easier starting point. I'll check out the specific issues you mentioned and keep following along.

@araffin
Member

araffin commented Jun 9, 2020

Closing this issue, as the beta of V3 is out (hill-a/stable-baselines#733) and we are now using PyTorch instead.

V3 repo: https://github.com/DLR-RM/stable-baselines3

@araffin closed this as completed Jun 9, 2020