Security Memo

optimal action-value function

created Jan 08, 2024 · updated May 17, 2024 · 1 min read

concept

The optimal action-value function $q_{*}(s, a)$ is the maximum action-value function over all policies. In other words, $q_{*}(s, a)$ is the expected return when the agent takes action $a$ in state $s$ and follows an optimal policy thereafter. In the equation below, the maximum is taken over policies $π$ for each fixed state-action pair $(s, a)$.

$$q_{*}(s, a) = \max_{π} q_{π}(s, a)$$
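To make the maximum over policies concrete, here is a minimal Python sketch that computes $q_{*}$ by brute force on a toy problem. Everything in it is an assumption made for illustration: the two-state, two-action deterministic MDP, its rewards, the discount factor `GAMMA`, and the helper names `q_for_policy` and `optimal_q`. It enumerates only deterministic policies, which is enough to attain the maximum in a finite MDP, and it is not a practical algorithm beyond tiny examples.

```python
from itertools import product

# Hypothetical toy MDP (not from the note): 2 states, 2 actions,
# deterministic transitions. NEXT_STATE[s][a] is the successor state,
# REWARD[s][a] the immediate reward.
STATES = [0, 1]
ACTIONS = [0, 1]
GAMMA = 0.9
NEXT_STATE = [[0, 1], [0, 1]]
REWARD = [[0.0, 1.0], [0.0, 2.0]]


def q_for_policy(policy, sweeps=2000):
    """Iterative policy evaluation for a deterministic policy.

    Repeatedly applies q_pi(s, a) = r(s, a) + GAMMA * q_pi(s', pi(s')),
    where s' is the successor of (s, a).
    """
    q = [[0.0 for _ in ACTIONS] for _ in STATES]
    for _ in range(sweeps):
        q = [
            [
                REWARD[s][a]
                + GAMMA * q[NEXT_STATE[s][a]][policy[NEXT_STATE[s][a]]]
                for a in ACTIONS
            ]
            for s in STATES
        ]
    return q


def optimal_q():
    """Take the elementwise maximum of q_pi over all deterministic policies."""
    policies = list(product(ACTIONS, repeat=len(STATES)))
    q_tables = [q_for_policy(pi) for pi in policies]
    return [
        [max(q[s][a] for q in q_tables) for a in ACTIONS]
        for s in STATES
    ]


Q_STAR = optimal_q()
for s in STATES:
    print(s, [round(v, 3) for v in Q_STAR[s]])
```

In this toy problem the policy that always picks action 1 attains the maximum for every state-action pair at once, which is what it means for a single optimal policy to produce $q_{*}$.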

Sources

Reinforcement learning

Backlinks

optimal policy
