value(policy, b, a) for AlphaVectorPolicy #504
Comments
We could use
but returning
I guess maybe
I think this is problematic even in the case where every action has an alpha vector: For example in TigerPOMDP, consider a belief I think to implement this correctly a 1-step lookahead for
A single step look-ahead might be a viable way to solve the no-alpha-vector-for-this-action problem, but it would not make
I agree with everything you say,
I concede your point that there is no objective theoretical definition of
Also, when Maxime originally implemented
This should be implemented, but is not. One question is what to return for an action that doesn't have an alpha vector. Should it be `missing` or `-Inf`? Probably `missing`. If we make that decision, `actionvalues` should probably also be updated.

Relevant to #492 (reply in thread)
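To make the `missing` option concrete, here is a minimal sketch of the convention being proposed. It does not use the POMDPs.jl API: `ToyAlphaPolicy` and `toy_value` are hypothetical names invented for illustration, and the belief is assumed to be a plain probability vector over states. As the comments in this thread point out, taking the best alpha vector tagged with an action is not necessarily a true action value, so this only illustrates the return convention, not a settled definition of `value(policy, b, a)`.

```julia
using LinearAlgebra: dot

# Hypothetical stand-in for an alpha-vector policy; NOT the POMDPs.jl type.
struct ToyAlphaPolicy{A}
    alphas::Vector{Vector{Float64}}  # alpha vectors, one value per state
    action_map::Vector{A}            # action tagged on each alpha vector
end

# Best dot(alpha, b) over the alpha vectors tagged with action `a`.
# Returns `missing` when no alpha vector is tagged with `a`.
function toy_value(p::ToyAlphaPolicy, b::AbstractVector{<:Real}, a)
    best = missing
    for (alpha, act) in zip(p.alphas, p.action_map)
        act == a || continue
        v = dot(alpha, b)
        if ismissing(best) || v > best
            best = v
        end
    end
    return best
end

# Two-state toy example; the action names are made up.
p = ToyAlphaPolicy([[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]], [:left, :right, :left])
b = [0.5, 0.5]

toy_value(p, b, :left)    # 0.5
toy_value(p, b, :right)   # 1.0
toy_value(p, b, :listen)  # missing
```

One argument for `missing` over `-Inf` is that `missing` propagates through arithmetic and comparisons, so callers have to handle the no-alpha-vector case explicitly, whereas `-Inf` would silently behave like a real (very bad) value in a `max` or `argmax`.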