I don't remember the details, but they are designed to change, i.e. to decay, as the total number of calls (`k`) increases. I think they are used in things like tabular TD learning.
(Since they are `Policy`s, they should probably also have the `action(p, s)` function, though it's not immediately obvious how to do that for them.)
I think they would need to store `k` and the policy. They could have an `update!` function for `k` and the policy. The policy field could have type `P`, where `P<:Union{Nothing,Policy}` is a type parameter (with `nothing` meaning "use the current `action` interface").
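To make the suggestion above concrete, here is a rough sketch of what such a wrapper might look like. It is not the POMDPTools implementation: `Policy` and `action` are local stand-ins so the snippet runs without any packages, and `ConstantPolicy`, `DecayingEpsGreedy`, the field names, and the keyword form of `update!` are all hypothetical.

```julia
using Random

# Local stand-in for POMDPs.Policy so the sketch runs on its own.
abstract type Policy end

# Hypothetical on-policy that always returns the same action.
struct ConstantPolicy{A} <: Policy
    a::A
end
action(p::ConstantPolicy, s) = p.a

# Hypothetical exploration policy that stores k and, optionally, the on-policy.
mutable struct DecayingEpsGreedy{P<:Union{Nothing,Policy},F,R<:AbstractRNG,A} <: Policy
    on_policy::P        # `nothing` means only the four-argument form is available
    eps::F              # schedule mapping k to an exploration probability
    k::Int              # number of calls so far
    rng::R
    actions::Vector{A}  # actions to sample from when exploring
end

# Four-argument form, mirroring the current exploration-policy interface.
function action(p::DecayingEpsGreedy, on_policy::Policy, k, s)
    return rand(p.rng) < p.eps(k) ? rand(p.rng, p.actions) : action(on_policy, s)
end

# Two-argument form, only defined when an on-policy is stored.
action(p::DecayingEpsGreedy{<:Policy}, s) = action(p, p.on_policy, p.k, s)

# Update k and/or the stored policy (the new policy must have the same type P).
function update!(p::DecayingEpsGreedy; k::Integer = p.k, on_policy = p.on_policy)
    p.k = k
    p.on_policy = on_policy
    return p
end

# Usage: exploration probability decays as 1/k.
p = DecayingEpsGreedy(ConstantPolicy(:left), k -> 1.0 / k, 1,
                      MersenneTwister(1), [:left, :right])
a = action(p, :some_state)  # two-argument call, as a simulator would make it
update!(p; k = 2)
```

With this layout a simulator could call `action(p, s)` directly, while a solver that wants to pass its own policy and step count could keep using the four-argument form. The caller would still be responsible for calling `update!`, which is one of the open design questions here.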
The exploration policies (https://github.com/JuliaPOMDP/POMDPs.jl/blob/master/lib/POMDPTools/src/Policies/exploration_policies.jl) do not meet the `action` interface described in the documentation, `action(::Policy, x)`, and cannot be used with the simulators directly. Instead they have the interface `action(p::EpsGreedyPolicy, on_policy::Policy, k, s)`. I was wondering if there is a reason for this?