Meaning of keyword abbreviations #56

zichunxx · 2024-06-01T08:46:11Z

Hi!

I'm new to model-based reinforcement learning. Thanks for making this available to PyTorch users.

I tried reading your code to understand the logic of DreamerV3, but some details are not covered in the original paper, especially the abbreviations used for some keys.

For example,

dreamerv3-torch/networks.py

Lines 174 to 179 in 4e50f30

    
    def obs_step(self, prev_state, prev_action, embed, is_first, sample=True):
        # initialize all prev_state
        if prev_state == None or torch.sum(is_first) == len(is_first):
            prev_state = self.initial(len(is_first))
            prev_action = torch.zeros((len(is_first), self._num_actions)).to(
                self._device

Here prev_state is a dict with three keys: logit, stoch, and deter.

What do they mean, and where can I find a more detailed explanation?

Thanks for your time!


NM512 · 2024-09-26T14:59:59Z

Thanks for asking!

They mean the following:

deter: the deterministic state, written as h_t in the paper
logit: the unnormalized scores for stoch, i.e. the values before the softmax function is applied
stoch: the stochastic discrete state, written as z_t in the paper
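
To make the relationship between the three keys concrete, here is a minimal sketch in plain Python. It is not the repository's code: the helper names (softmax, sample_one_hot, make_state) and the sizes are hypothetical, deter is a hand-written stand-in for the recurrent state, and the final concatenation into one feature vector is only an illustrative assumption.

```python
import math
import random


def softmax(logits):
    """Turn unnormalized scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_one_hot(probs, rng):
    """Draw one class index from probs and return it as a one-hot vector."""
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return [1.0 if i == index else 0.0 for i in range(len(probs))]


def make_state(logit, deter, rng):
    """Build a state dict with the same three keys as prev_state.

    logit: one row of class scores per categorical variable (hypothetical sizes).
    deter: stand-in for the deterministic recurrent state h_t.
    """
    # stoch (z_t) is sampled from the distribution that logit parameterizes.
    stoch = [sample_one_hot(softmax(row), rng) for row in logit]
    return {"logit": logit, "stoch": stoch, "deter": deter}


rng = random.Random(0)
logit = [[2.0, 0.5, -1.0], [0.0, 0.0, 3.0]]  # 2 categorical variables, 3 classes each
deter = [0.1, -0.3, 0.7, 0.0]  # assumed values, only for illustration
state = make_state(logit, deter, rng)

# Assumption for illustration: flatten stoch and append deter to get one feature vector.
feat = [v for row in state["stoch"] for v in row] + state["deter"]
```

Each row of state["stoch"] is a one-hot vector, while state["logit"] keeps the raw scores that the sample was drawn from, so probabilities can be recomputed later with softmax.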

If you have any further questions, feel free to ask!
