Commit

20240412
BlitherBoom812 committed Apr 12, 2024
1 parent 1eda7a3 commit 500d6ef
Showing 3 changed files with 15 additions and 0 deletions.
15 changes: 15 additions & 0 deletions source/_posts/DRL.md
@@ -4,6 +4,21 @@ katex: true
date: 2024-04-10 13:34:06
tags:
---
## Policy Gradient

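A weighted gradient method (ascent on $J$): the gradient of the log-probability of each action taken is weighted by the return $R(\tau)$ of its trajectory.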

$$
\nabla_\theta J(\theta)=\mathbb{E}_{\pi_\theta}\left[\sum_{t}\nabla_\theta\log\pi_\theta(a_t\mid s_t)\,R(\tau)\right]
$$
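
As a rough illustration of the expectation above, the sketch below forms a Monte Carlo estimate by averaging, over sampled trajectories, the log-probability gradients summed over time steps and weighted by the return. The tabular softmax policy (`theta[s][a]` as logits) and the names `grad_log_pi` and `policy_gradient_estimate` are assumptions made for this example, not part of the original notes.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def grad_log_pi(theta, s, a):
    """Gradient of log pi_theta(a|s) w.r.t. theta for a tabular softmax policy.

    Assumption: theta[s][b] is the logit of action b in state s, so
    d log pi(a|s) / d theta[s][b] = 1[a == b] - pi(b|s), and zero for other states.
    """
    probs = softmax(theta[s])
    grad = [[0.0] * len(row) for row in theta]
    for b, p in enumerate(probs):
        grad[s][b] = (1.0 if b == a else 0.0) - p
    return grad


def policy_gradient_estimate(theta, trajectories):
    """Average over trajectories of sum_t grad log pi(a_t|s_t) * R(tau).

    Each trajectory is a pair (steps, ret): steps is a list of (s_t, a_t)
    and ret is the total return R(tau) of that trajectory.
    """
    total = [[0.0] * len(row) for row in theta]
    for steps, ret in trajectories:
        for s, a in steps:
            g = grad_log_pi(theta, s, a)
            for i, row in enumerate(g):
                for j, v in enumerate(row):
                    total[i][j] += v * ret
    n = len(trajectories)
    return [[v / n for v in row] for row in total]
```

A gradient ascent step would then add a small multiple of this estimate to `theta`. In practice a baseline is usually subtracted from the return to reduce variance; that is omitted here to stay close to the formula.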

## A2C

$$
\Delta\theta=\alpha\,\nabla_\theta\log\pi_\theta(a_t\mid s_t)\,\hat{q}_w(s_t,a_t)\\
\Delta w=\beta\left(R(s_t,a_t)+\gamma\hat{q}_{w}(s_{t+1},a_{t+1})-\hat{q}_{w}(s_{t},a_{t})\right)\nabla_{w}\hat{q}_{w}(s_{t},a_{t})
$$
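
The two updates above can be sketched for a single transition as follows. This is a minimal illustration under stated assumptions: the actor is a tabular softmax policy with logits `theta[s][a]`, the critic is a table `q[s][a]` (so $\nabla_w\hat{q}_w$ is 1 at the visited entry and 0 elsewhere), and the function name `actor_critic_step` and its default step sizes are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def actor_critic_step(theta, q, s, a, r, s_next, a_next,
                      alpha=0.1, beta=0.1, gamma=0.99):
    """Apply one actor update and one critic update in place; return the TD error.

    theta[s][a]: softmax logits of the actor (assumed tabular).
    q[s][a]:     tabular critic estimate of the action value.
    """
    probs = softmax(theta[s])
    q_sa = q[s][a]  # critic value used by both updates, read before any change

    # TD error: R(s_t, a_t) + gamma * q(s_{t+1}, a_{t+1}) - q(s_t, a_t)
    td_error = r + gamma * q[s_next][a_next] - q_sa

    # Actor: delta_theta = alpha * grad log pi(a_t|s_t) * q(s_t, a_t)
    for b, p in enumerate(probs):
        theta[s][b] += alpha * ((1.0 if b == a else 0.0) - p) * q_sa

    # Critic: delta_w = beta * td_error * grad_w q(s_t, a_t); for a table the
    # gradient is nonzero only at the visited (s, a) entry.
    q[s][a] += beta * td_error
    return td_error
```

Note that this version weights the actor update by the critic's action value, exactly as in the formula; variants that weight by an advantage estimate instead differ only in that one factor.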

## Model-Based RL

### Model-Based RL
Binary file added source/images/DRL/1712925351346.png
Binary file added source/images/DRL/1712925354266.png
