Meaning of keyword abbreviations #56

Open
zichunxx opened this issue Jun 1, 2024 · 1 comment


zichunxx commented Jun 1, 2024

Hi!

I'm new to model-based reinforcement learning. Thanks for your contribution to PyTorch users.

I tried reading your code to understand the logic of DreamerV3, but some details are not mentioned in the original paper, especially the meaning of some abbreviated keywords.

For example,

dreamerv3-torch/networks.py

Lines 174 to 179 in 4e50f30

def obs_step(self, prev_state, prev_action, embed, is_first, sample=True):
    # initialize all prev_state
    if prev_state == None or torch.sum(is_first) == len(is_first):
        prev_state = self.initial(len(is_first))
        prev_action = torch.zeros((len(is_first), self._num_actions)).to(
            self._device

Here, prev_state is a dict with three keys: logit, stoch, and deter.

What do they mean, and where can I find a more detailed explanation?

Thanks for your time!

NM512 (Owner) commented Sep 26, 2024

Thanks for asking!

They mean the following:

  • deter: the deterministic state, written as h_t in the paper
  • logit: the logits of the categorical distribution from which stoch is sampled, i.e., the values before the softmax function is applied
  • stoch: the stochastic discrete state, written as z_t in the paper
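
To make the relationship between the three keys concrete, here is a minimal, framework-free sketch. It is not the code from networks.py: the helper names (softmax, sample_one_hot, make_state), the plain Python lists, and the tiny sizes are all assumptions chosen for illustration, and details of the real implementation are left out.

```python
import math
import random


def softmax(logits):
    """Turn one row of logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_one_hot(probs, rng):
    """Draw one class index from probs and return it as a one-hot vector."""
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return [1.0 if i == index else 0.0 for i in range(len(probs))]


def make_state(deter, logit, rng):
    """Hypothetical helper: build a dict with the same three keys as prev_state.

    deter: deterministic state vector (h_t in the paper)
    logit: one row of pre-softmax values per categorical variable
    stoch: one-hot sample per categorical variable (z_t in the paper)
    """
    stoch = [sample_one_hot(softmax(row), rng) for row in logit]
    return {"deter": deter, "logit": logit, "stoch": stoch}


rng = random.Random(0)
state = make_state(
    deter=[0.1, -0.3, 0.7],            # assumed size: 3 deterministic units
    logit=[[2.0, 0.5, -1.0, 0.0],      # assumed size: 2 categorical variables
           [0.0, 0.0, 3.0, 0.0]],      # with 4 classes each
    rng=rng,
)
print(sorted(state.keys()))
```

In this sketch, logit holds the raw scores, softmax converts each row into class probabilities, and stoch is a one-hot sample drawn from those probabilities, while deter is passed through unchanged.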

If you have any further questions, feel free to ask!
