
[Port HIL-SERL] Fixes for SAC, tested with real robot #619

Draft
wants to merge 18 commits into
base: user/michel-aractingi/2024-11-27-port-hil-serl


helper2424
Contributor

WIP
These are fixes based on #616.

helper2424 and others added 18 commits January 4, 2025 17:56
- Introduced target critic networks in SACPolicy to enhance stability during training.
- Updated TD target calculation to incorporate entropy adjustments, improving robustness.
- Increased online buffer capacity in configuration from 10,000 to 40,000 for better data handling.
- Adjusted learning rates for critic, actor, and temperature to 3e-4 for optimized training performance.

These changes aim to refine the SAC implementation, enhancing its robustness and performance during training and inference.
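
For reviewers less familiar with SAC, the first two items can be sketched roughly as follows. This is a minimal plain-Python illustration of the standard SAC formulation, not the code in this PR: the function names `sac_td_target` and `soft_update`, the default values for `gamma`, `alpha`, and `tau`, and the use of Polyak averaging to maintain the target critics are all assumptions made for the example.

```python
# Illustrative sketch only: scalar SAC target math, not the PR's implementation.
# All names and default values below are hypothetical.

def sac_td_target(reward, done, next_q_targets, next_log_prob,
                  gamma=0.99, alpha=0.2):
    """Entropy-adjusted TD target for a single transition.

    next_q_targets: Q-values for the next state-action pair, one per
    *target* critic. Taking the minimum limits overestimation.
    next_log_prob: log-probability of the next action under the policy.
    """
    min_next_q = min(next_q_targets)
    # Entropy adjustment: subtract alpha * log pi(a'|s') from the critic value.
    soft_value = min_next_q - alpha * next_log_prob
    # No bootstrapping past a terminal transition (done == 1.0).
    return reward + gamma * (1.0 - done) * soft_value


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: nudge target parameters toward the online ones."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    target = sac_td_target(reward=1.0, done=0.0,
                           next_q_targets=[2.0, 3.0], next_log_prob=-1.0,
                           gamma=0.9, alpha=0.5)
    print(target)
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1))
```

Because the target critics change slowly, the bootstrapped value in the TD target moves less from one update to the next, which is the usual reason target networks are credited with stabilizing training.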
helper2424 changed the base branch from user/michel-aractingi/2024-11-27-port-hil-serl-backup to user/michel-aractingi/2024-11-27-port-hil-serl on January 15, 2025 12:22

4 participants