MSE computation issue #165

Open
svsawant opened this issue Sep 24, 2024 · 2 comments
Labels: bug (Something isn't working)

Comments

@svsawant
Collaborator

In the RL training pipeline (for SAC and PPO), the mse values computed and tracked during evaluation runs appear to be wrong. They match neither the mse in "info" returned by env.step nor the rmse results from evaluating the policy directly through rl_experiment.sh. A deeper dive suggests the issue lies in how mse is handled in "RecordEpisodeStatistics".
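One way to get an independent reference number is to aggregate the per-step values by hand and compare them with what the wrapper reports. The snippet below is only a sketch: `episode_error_stats` is a hypothetical helper (not part of the repository), and it assumes each `info` dict returned by `env.step` carries a scalar per-step squared error under the key `"mse"`.

```python
import math


def episode_error_stats(infos, key="mse"):
    """Aggregate per-step squared-error values from a list of `info` dicts.

    Assumption: each `info` holds a scalar squared error under `key`.
    Dicts without the key are skipped.
    """
    values = [float(info[key]) for info in infos if key in info]
    if not values:
        raise ValueError(f"no '{key}' entries found in infos")
    mean_mse = sum(values) / len(values)
    return {"steps": len(values), "mse": mean_mse, "rmse": math.sqrt(mean_mse)}


# Toy data standing in for the `info` dicts collected over one episode.
infos = [{"mse": 0.04}, {"mse": 0.16}, {"other": 1.0}, {"mse": 0.25}]
stats = episode_error_stats(infos)
print(stats)
```

If the value tracked during training differs from this hand-computed episode mean, that would point at the aggregation step rather than at the environment itself.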

@adamhall
Contributor

Thanks @svsawant, do you have a simple example we could take a look at?

@svsawant
Collaborator Author

To replicate, consider the following test case. Train an RL controller on a quadrotor and go through the logs. Then execute the trained policy using rl_experiment.sh, which also prints the run stats. The mse values from the training run (after taking the square root) are higher than the rmse values printed during policy execution.
Here's a test run I did for PPO with the quadrotor (with the attitude control interface).
Screenshot from 2024-09-28 11-34-56
Next, the run stats from the policy evaluation:
Screenshot from 2024-09-28 11-40-29
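When comparing the two sets of numbers, one effect worth ruling out (it is not confirmed as the cause here) is the order of averaging and square root. The square root of an averaged MSE is never smaller than the average of per-episode RMSE values, so the two quantities differ even when the underlying errors are identical. The per-episode values below are made up purely for illustration.

```python
import math

# Hypothetical per-episode MSE values, for illustration only.
episode_mse = [0.01, 0.09]

# Square root taken after averaging the MSE over episodes.
sqrt_of_mean_mse = math.sqrt(sum(episode_mse) / len(episode_mse))

# RMSE computed per episode first, then averaged over episodes.
mean_of_rmse = sum(math.sqrt(m) for m in episode_mse) / len(episode_mse)

print(f"sqrt(mean(mse)) = {sqrt_of_mean_mse:.4f}")
print(f"mean(sqrt(mse)) = {mean_of_rmse:.4f}")
```

A gap in this direction is expected from the averaging order alone, so it helps to confirm that both code paths aggregate the same way before attributing the whole mismatch to the wrapper.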
