[Question] Manually Controlling Actions During PPO Training #2014
Labels
check the checklist: You have checked the required items in the checklist but you didn't do what is written...
custom gym env: Issue related to Custom Gym Env
more information needed: Please fill the issue template completely
question: Further information is requested
❓ Question
Thank you very much for creating such an excellent tool. I am currently using the PPO algorithm in Stable-Baselines3 (SB3) for training in a custom environment. During this process, I encountered an issue that I would appreciate your guidance on.
When I call model.learn(total_timesteps=10e6), the call blocks the current thread until training finishes, so the communication inside my environment stops running in the meantime. I would like to control the actions manually during training, similar to the following process:
Is there a way to continue training the PPO model while allowing manual control over the action selection, and keeping the environment’s communication running? Do you have any recommended solutions for this?
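One possible direction, sketched here with the standard library only, is to run the blocking training call in a background thread and let the main thread keep the communication loop, exchanging actions and results through queues. In the sketch, `ActionBridge`, `blocking_train`, and `run_demo` are hypothetical names: `blocking_train` merely stands in for `model.learn(...)`, and `ActionBridge.step` is what a custom environment's `step()` might delegate to. None of this is SB3 API. Note also that PPO is on-policy, so overriding the actions it sampled means the collected data no longer matches the policy, which may hurt learning.

```python
import queue
import threading


class ActionBridge:
    """Hypothetical helper: passes actions from the training thread to the main thread."""

    def __init__(self):
        self.actions = queue.Queue(maxsize=1)
        self.results = queue.Queue(maxsize=1)

    def step(self, action):
        # Called from the training thread, e.g. inside a custom env's step().
        self.actions.put(action)
        return self.results.get()  # blocks until the main thread replies

    def serve_once(self, apply_action, override=None, timeout=1.0):
        # Called from the main thread, which keeps owning the communication.
        try:
            proposed = self.actions.get(timeout=timeout)
        except queue.Empty:
            return False  # nothing proposed yet; the caller can do other work
        action = proposed if override is None else override(proposed)
        self.results.put(apply_action(action))
        return True


def blocking_train(bridge, n_steps):
    # Stand-in for model.learn(): a blocking loop that keeps calling step().
    for t in range(n_steps):
        bridge.step(t % 2)  # the "policy" proposes alternating actions 0, 1, 0, ...


def run_demo(n_steps=5):
    bridge = ActionBridge()
    executed = []

    def apply_action(action):
        # Placeholder for the real communication with the external system.
        executed.append(action)
        return float(action)  # placeholder observation

    def override(proposed):
        # Manual control: force action 1 on the third step, else keep the proposal.
        return 1 if len(executed) == 2 else proposed

    trainer = threading.Thread(
        target=blocking_train, args=(bridge, n_steps), daemon=True
    )
    trainer.start()
    # The main thread stays responsive while "training" runs in the background.
    while trainer.is_alive():
        bridge.serve_once(apply_action, override, timeout=0.1)
    trainer.join()
    return executed
```

Calling `run_demo()` runs five steps in which the main thread applies every action itself and replaces the third proposed action. With a real model, the background thread would call `model.learn(...)` instead of `blocking_train`, under the assumption that the environment's `step()` forwards to the bridge.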
I greatly appreciate your time and any insights you can provide. Your work has been incredibly valuable, and I look forward to any suggestions you might have.
Checklist