
[question] In train.py, why is gamma in VecNormalize not updated per trial? #91

Open
liyan2015 opened this issue Jul 3, 2020 · 1 comment
Labels
enhancement New feature or request

Comments

@liyan2015

Hi, according to this issue, VecNormalize's gamma should match the gamma of the RL algorithm (e.g., gamma=0.99 should be used consistently in both PPO2 and VecNormalize) to keep the discounted-return window consistent. However, the normalization arguments used in create_env seem to always be the defaults read from the .yml file (i.e., gamma=0.99):

env = VecNormalize(env, **normalize_kwargs)

although gamma has different candidates in hyperparams_opt.py:

gamma = trial.suggest_categorical('gamma', [0.9, 0.95, 0.98, 0.99, 0.995, 0.999, 0.9999])

The same applies to rl-baselines3-zoo. Is this a bug? Should create_env take the per-trial gamma into account when initializing VecNormalize? Please give me a hint if I missed anything, thank you!
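For illustration, a hypothetical fix could merge the trial's sampled gamma into normalize_kwargs before the env is wrapped. This is a sketch under my own naming, not the actual zoo code:

```python
# Hypothetical helper (not the actual rl-zoo code): keep VecNormalize's gamma
# consistent with the gamma sampled for the current Optuna trial.
def merge_trial_gamma(normalize_kwargs, sampled_hyperparams):
    """Return VecNormalize kwargs whose gamma matches the trial's gamma."""
    merged = dict(normalize_kwargs)  # copy so the shared config is not mutated
    if "gamma" in sampled_hyperparams:
        merged["gamma"] = sampled_hyperparams["gamma"]
    return merged

# Usage (values are illustrative):
defaults = {"norm_obs": True, "norm_reward": True, "gamma": 0.99}
trial_params = {"gamma": 0.995, "learning_rate": 3e-4}
kwargs = merge_trial_gamma(defaults, trial_params)
# kwargs["gamma"] is now 0.995, and the result would be passed as
# VecNormalize(env, **kwargs) inside create_env.
```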

@araffin
Owner

araffin commented Jul 6, 2020

Good point.

Overall, it should not make a big difference, as the main point is to normalize the reward magnitude.
But for consistency, I agree that gamma should be updated.

Related: hill-a/stable-baselines#698
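To see why the effect is small, here is a simplified sketch of the kind of reward normalization VecNormalize performs: rewards are scaled by the standard deviation of a running *discounted* return estimate, so gamma only sets the effective window of that estimate. This approximates the real implementation (which uses a running-mean/std helper); the exact update rule here is mine:

```python
import math

# Simplified sketch of discounted-return-based reward normalization
# (approximating VecNormalize; the real code uses a RunningMeanStd helper).
def normalized_rewards(rewards, gamma=0.99, epsilon=1e-8):
    ret = 0.0       # running discounted return
    returns = []    # history used for the (naive) variance estimate
    out = []
    for r in rewards:
        ret = ret * gamma + r
        returns.append(ret)
        mean = sum(returns) / len(returns)
        var = sum((x - mean) ** 2 for x in returns) / len(returns)
        # Scale the raw reward by the std of the discounted-return estimate.
        out.append(r / math.sqrt(var + epsilon))
    return out
```

With a mismatched gamma the normalization window differs from the algorithm's horizon, but the rewards are still brought to a comparable magnitude, which is the main goal.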

@araffin araffin added the enhancement New feature or request label Jul 6, 2020