Two similar custom environments, PPO learns on both but SAC only on one #1824

Closed
5 tasks done
tfederico opened this issue Feb 2, 2024 · 1 comment
Labels: custom gym env (Issue related to Custom Gym Env), No tech support (We do not do tech support)

Comments

@tfederico

🐛 Bug

I have two custom environments, one with a hand and one with a humanoid with hands. In the past, I trained the humanoid with just PPO and the hand with both PPO and SAC.
I am currently trying to train the humanoid with SAC by adapting the training script I used for the hand. However, it does not converge at all. I have tried several hyperparameter combinations, but I always get the same result: the maximum reward per step is 1, and the cumulative reward after 2000 steps is always 1. However, if I train on the exact same environment with PPO instead of SAC, it works. Likewise, if I use the exact same training script with the hand environment, it works.

I took a look at this issue and this one, as I noticed that for some runs the loss diverges.

Any idea about what could be wrong? To me, it looks like it might be related to SAC rather than the custom gym env, but I might be wrong...

Code example

Training script

Custom env humanoid

Relevant log output / Error message

No response

System Info

Describe the characteristics of your environment:

  • GPU model: RTX 2080Ti
  • Versions of any other relevant libraries: all indicated in requirements.txt file
python==3.9.12
gym==0.21.0
matplotlib==3.5.2
mpi4py
numpy==1.23.1
opencv-python==4.6.0.66
packaging==21.3
pandas==1.4.3
pybullet==3.2.5
pytorch3d==0.6.2
scipy==1.8.1
stable-baselines3==1.6.0
torch==1.11.0
torchvision==0.12.0
tqdm==4.64.0
wandb==0.13.1

Checklist

tfederico added the "custom gym env" label on Feb 2, 2024
tfederico changed the title from "Two similar custom environment, PPO learns on both but SAC only on one" to "Two similar custom environments, PPO learns on both but SAC only on one" on Feb 2, 2024
araffin added the "No tech support" label on Feb 2, 2024
@araffin
Member

araffin commented Feb 7, 2024

Hello,
your issue falls into the category of "tech support" ("why doesn't X work on Y?"), which we don't do (as mentioned in the README and issue template). The RL Discord, Reddit, or Stack Overflow are better places for such questions.

The only thing I can recommend is to read/watch our documentation, especially the "RL tips and tricks" section (see for instance #1826).
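
One tip from that guide that may be worth checking for SAC on a custom continuous-control environment is to keep the action space normalized and symmetric (typically [-1, 1]) and to rescale to the native limits inside the environment. Below is a minimal sketch of such a rescaling in plain Python. The function names and the joint limits are hypothetical and are not taken from the humanoid environment in this issue.

```python
from typing import List, Sequence


def rescale_action(scaled: Sequence[float],
                   low: Sequence[float],
                   high: Sequence[float]) -> List[float]:
    """Map an action in [-1, 1] to the native [low, high] range, per dimension."""
    if not (len(scaled) == len(low) == len(high)):
        raise ValueError("scaled, low and high must have the same length")
    native = []
    for a, lo, hi in zip(scaled, low, high):
        a = max(-1.0, min(1.0, a))  # clip to the normalized range first
        native.append(lo + 0.5 * (a + 1.0) * (hi - lo))
    return native


def normalize_action(native: Sequence[float],
                     low: Sequence[float],
                     high: Sequence[float]) -> List[float]:
    """Inverse mapping: native [low, high] range back to [-1, 1]."""
    if not (len(native) == len(low) == len(high)):
        raise ValueError("native, low and high must have the same length")
    return [2.0 * (a - lo) / (hi - lo) - 1.0
            for a, lo, hi in zip(native, low, high)]


# Hypothetical joint limits, for illustration only.
LOW = [-0.5, 0.0, -2.0]
HIGH = [1.5, 2.0, 2.0]

# The agent acts in [-1, 1]; the environment converts before stepping the simulator.
native_action = rescale_action([0.0, -1.0, 1.0], LOW, HIGH)
```

With this pattern, the environment would declare its action space as a symmetric box over [-1, 1] and call something like `rescale_action` at the start of `step()`. Whether this is related to the behavior reported above is not established by this thread.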

araffin closed this as not planned on Feb 14, 2024