v0.0.4
Enhancement
- add agent configurations & polish replay video saving method (#184)
- polish comments in worker files
- polish comments in tree search files (#185)
- rename mcts_mode to battle_mode_in_simulation_env, add sampled alphazero config for tictactoe (#179)
- polish redundant data squeeze operations (#177)
- polish continuous action processing in the sampled efficientzero (sez) model
- polish bipedalwalker env
Fix
- fix inf completed value bug in gumbel muzero when action_mask contains zeros (#178)
- fix render settings when using gymnasium (#173)
- fix lstm_hidden_size in sampled_efficientzero_model.py
- fix action_mask in bipedalwalker_cont_disc_env, fix device bug in sampled efficientzero (#168)
Full Changelog: v0.0.3...v0.0.4
Contributors: @karroyan @HarryXuancy @puyuan1996 @zjowowen