[Port HIL-SERL] Fixes for SAC, tested with real robot #619

helper2424 · 2025-01-07T16:48:38Z

WIP
These fixes are based on #616.

- Introduced target critic networks in SACPolicy to enhance stability during training.
- Updated TD target calculation to incorporate entropy adjustments, improving robustness.
- Increased online buffer capacity in configuration from 10,000 to 40,000 for better data handling.
- Adjusted learning rates for critic, actor, and temperature to 3e-4 for optimized training performance.

These changes aim to refine the SAC implementation, enhancing its robustness and performance during training and inference.
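For reviewers less familiar with SAC, the target-critic and entropy-adjusted TD target changes described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code in this PR: the function names `soft_td_target` and `polyak_update` are hypothetical, the default values of `gamma`, `alpha`, and `tau` are assumptions, and taking the minimum over the target critics is the usual SAC convention, which this PR may or may not follow exactly.

```python
def soft_td_target(reward, done, next_q_values, next_log_prob, gamma=0.99, alpha=0.1):
    """Entropy-adjusted TD target for a single transition (illustrative only).

    next_q_values: estimates from the *target* critics for the next state and
                   an action sampled from the current policy.
    next_log_prob: log-probability of that sampled next action.
    gamma, alpha:  discount and temperature; the defaults here are assumptions.
    """
    # Pessimistic estimate: minimum over the target critics.
    min_q = min(next_q_values)
    # Entropy adjustment: subtract alpha * log pi(a'|s').
    soft_value = min_q - alpha * next_log_prob
    # No bootstrapping past a terminal transition (done == 1.0).
    return reward + gamma * (1.0 - done) * soft_value


def polyak_update(target_params, online_params, tau=0.005):
    """Soft update of target parameters toward the online critic parameters."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    y = soft_td_target(reward=1.0, done=0.0, next_q_values=[2.0, 3.0],
                       next_log_prob=-1.0, gamma=0.5, alpha=0.5)
    print("TD target:", y)
    print("updated target params:", polyak_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
```

Because the target critics change only slowly, the regression target for the online critics moves slowly as well, which is the usual reason target networks help stability.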

helper2424 and others added 18 commits January 4, 2025 17:56

Last configs

a51cc67

Last changes

f3f936c

1. add input normalization in configuration_sac.py; 2. add masking on loss computation

77a7f92

use mean instead of sampled action for the inference

f1f04eb

fix the bug of target critic updates, roll back to original temperature implementation, added debug logging info

eec28ba

Potential fixes for SAC instability and NAN bug

db3925d

improvements from JClinton to speed up loading offline data

8b70b12

remove unused debug lines

89d8189

Fixup

a8ab76c

Merge branch 'new_port_hil_serl' of https://github.com/Ke-Wang1017/lerobot into hil_serl_check_sac

fb62a0a

Add fixes

959df13

Last fixes

3ebebdb

Fixup

c91c6cf

Decrease batch size

6f77a9f

Fixup

0fc3cfb

Fixup

5f1cbe0

Fix SAC

665fef2
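As a rough illustration of the commit "use mean instead of sampled action for the inference" listed above, the sketch below contrasts a stochastic action (for exploration during training) with a deterministic one (for inference). It is not the code from this PR: `select_action` is a hypothetical helper, and the tanh-squashed diagonal Gaussian policy head is an assumption based on the common SAC formulation.

```python
import math
import random


def select_action(mean, log_std, deterministic, rng=random):
    """Return a tanh-squashed action from a diagonal Gaussian policy head.

    mean, log_std: per-dimension outputs of the actor (assumed layout).
    deterministic: True at inference time, False while exploring.
    """
    if deterministic:
        # Inference: use the distribution mean, no sampling noise.
        pre_tanh = list(mean)
    else:
        # Training: sample mean + std * eps with eps ~ N(0, 1).
        pre_tanh = [m + math.exp(s) * rng.gauss(0.0, 1.0)
                    for m, s in zip(mean, log_std)]
    # Squash into [-1, 1] so actions respect bounded actuator ranges.
    return [math.tanh(x) for x in pre_tanh]


if __name__ == "__main__":
    mean, log_std = [0.0, 1.0], [0.0, 0.0]
    print("inference action:", select_action(mean, log_std, deterministic=True))
    print("exploration action:", select_action(mean, log_std, deterministic=False))
```

Using the mean at inference time removes sampling noise from the executed actions, which generally matters more on a real robot than in simulation.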

helper2424 changed the base branch from user/michel-aractingi/2024-11-27-port-hil-serl-backup to user/michel-aractingi/2024-11-27-port-hil-serl January 15, 2025 12:22
