textbook 10 #166

Open
xiaowuhu opened this issue Jun 13, 2022 · 1 comment
@xiaowuhu
Collaborator

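  • Take Figure 10.1.1 as an example: it does not show the symbols a, s, and r, so readers cannot map the figure to the explanation that follows.
  • The $G_t$ formula in 10.1.2 has an expanded form; writing it out would make it easier to understand.
  • As I recall, $\gamma$ is in (0,1], not [0,1).
  • For formulas such as $\pi (a|s)=p(a_t=a|s_t=s)$, the standard notation (in my opinion) is $\pi (a|s)=p(A_t=a|S_t=s)$.
  • The reference list in 10.1 is too long and out of proportion to the main text.
  • 10.1.2 may need a diagram of three overlapping circles to show the relationship among the three.
  • Please put 10.1.1 and 10.1.2 in a single md file.
  • Why do figures in 10.2.2 have four-level index numbers? For example, Figure 10.2.2.1 could be renumbered as 10.2.1.
  • Consider merging 10.2.2 and 10.2.3 into 10.3, so that its length and content match 10.2.1 (which would become 10.2).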
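
For reference, the expanded form of the return mentioned in the second bullet is usually written as below. This is a sketch that assumes the indexing convention of Sutton and Barto (reward $R_{t+1}$ follows the action at step $t$); the textbook's own symbols may differ.

$$
G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}
$$

The same quantity can be written recursively as $G_t = R_{t+1} + \gamma G_{t+1}$, which makes the role of $\gamma$ as a per-step discount explicit.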
@xuehui1991
Contributor

xuehui1991 commented Jul 13, 2022

Thanks for your comments.
As for the range of gamma, I checked the book [1], which defines gamma in [0, 1].
However, setting gamma to 1 makes it hard for the model to converge.
You could refer to this discussion.
Therefore, I will follow the book [1] and revise this part.
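
To illustrate the point about gamma, here is a minimal Python sketch of the discounted return. The function name `discounted_return` and the constant reward sequence are hypothetical, chosen only for illustration; they are not taken from the textbook.

```python
def discounted_return(rewards, gamma):
    """Return sum_k gamma**k * rewards[k] for a finite list of rewards."""
    g = 0.0
    # Accumulate from the last reward backwards, using G_t = R_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical setup: a constant reward of 1 at every step.
for steps in (10, 100, 1000):
    rewards = [1.0] * steps
    for gamma in (0.9, 1.0):
        print(steps, gamma, discounted_return(rewards, gamma))
```

With gamma below 1 the return stays bounded (it approaches 1 / (1 - gamma) for a constant reward of 1), while with gamma equal to 1 it grows with the number of steps. That is why gamma = 1 is normally only safe for tasks that are guaranteed to terminate.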

  • [1] Sutton R S, Barto A G. Reinforcement Learning: An Introduction[M]. MIT Press, 2018.
