A policy-indexed MDP is an MDP whose state transition matrix and reward function are redefined in terms of a policy function. The missing action superscript on both indicates that the action taken by the agent is no longer a free index but is dictated by the policy.
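
As a rough illustration of that redefinition, the sketch below averages action-indexed transitions and rewards over a stochastic policy, which is one common way to obtain the policy-indexed quantities. The text does not spell out its notation, so the array layouts, the function name `policy_indexed_mdp`, and the toy numbers are all assumptions made for this example.

```python
def policy_indexed_mdp(P, R, pi):
    """Average action-indexed dynamics and rewards over a stochastic policy.

    Assumed layouts (hypothetical, chosen for this sketch):
      P[a][s][t]: probability of moving from state s to state t under action a
      R[a][s]:    expected immediate reward for taking action a in state s
      pi[s][a]:   probability that the policy picks action a in state s
    """
    n_actions = len(P)
    n_states = len(pi)
    # Policy-indexed transition matrix: the action index is summed out.
    P_pi = [
        [sum(pi[s][a] * P[a][s][t] for a in range(n_actions))
         for t in range(n_states)]
        for s in range(n_states)
    ]
    # Policy-indexed reward vector: expected reward under the policy's action choice.
    R_pi = [
        sum(pi[s][a] * R[a][s] for a in range(n_actions))
        for s in range(n_states)
    ]
    return P_pi, R_pi


# Toy two-state, two-action MDP (all numbers are made up for illustration).
P = [
    [[1.0, 0.0], [0.5, 0.5]],  # action 0
    [[0.0, 1.0], [0.0, 1.0]],  # action 1
]
R = [
    [0.0, 1.0],  # action 0
    [2.0, 0.0],  # action 1
]
pi = [
    [0.5, 0.5],  # in state 0, pick either action with equal probability
    [1.0, 0.0],  # in state 1, always pick action 0
]

P_pi, R_pi = policy_indexed_mdp(P, R, pi)
```

Neither `P_pi` nor `R_pi` carries an action index: once the policy is fixed, the MDP behaves like a Markov reward process over states alone. For a deterministic policy, each row of `pi` would put all its mass on one action, and the sums reduce to simply selecting that action's row.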